IIS URL Rewrite URL with Optional Index Page - regex

I've been able to pattern-match a url with or without a trailing slash
<match url="locations/(.+?)/?$" />
However, I want to be able to match locations/[location]/index.aspx as well.
How can I incorporate this optional pattern?
I tried the clumsy:
<match url="(towns/(.+?)/?$)|(towns/(.+?)/index\.aspx$)" />
which wasn't liked at all!
Any help would be appreciated.

This
(?<=locations\/)(.+?)(?=\/|$|\s)
will match [location] in
/locations/[location]/
/locations/[location]
/locations/[location]/index.aspx
/locations/[location]/anything_on_the_planet.html
/locations/[location] <A bunch of text over here>
If I understand your question correctly, you want the match:
with or without a trailing slash,
to follow locations/,
to work regardless of what comes after [location].
If you need anything else, let me know.
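As a quick way to check the patterns discussed above, here is a minimal sketch in Python. Python's re module is only a stand-in for the IIS engine, and the sample location "london" and the helper names are hypothetical. Two caveats: the lookbehind (?<=...) may not be supported by IIS URL Rewrite, which documents ECMAScript-style syntax, and the second pattern is an assumed alternative (not from the answer) that avoids lookarounds by making /index.aspx optional.

```python
import re

# Pattern from the answer above (uses lookbehind and lookahead).
LOOKAROUND = re.compile(r"(?<=locations\/)(.+?)(?=\/|$|\s)")

# Assumed alternative without lookarounds: an optional trailing slash,
# optionally followed by index.aspx. Assumes a location contains no "/".
OPTIONAL_INDEX = re.compile(r"locations/([^/]+)(?:/(?:index\.aspx)?)?$")


def location_via_lookaround(url):
    """Return the segment right after 'locations/', or None if absent."""
    m = LOOKAROUND.search(url)
    return m.group(1) if m else None


def location_via_optional_index(url):
    """Return the location only for the three URL shapes in the question."""
    m = OPTIONAL_INDEX.search(url)
    return m.group(1) if m else None


for url in ("locations/london", "locations/london/", "locations/london/index.aspx"):
    print(url, "->", location_via_optional_index(url))
```

In a rule, the second pattern would go into the match element's url attribute in place of the original one; anything other than index.aspx after the location is deliberately not matched.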

Related

IIS URL Rewrite to Redirect PDF File (Regex)

Server: IIS 8
I have a URL Rewrite rule to redirect PDF files to a page that will handle some additional processing. Everything works well except when the PDF file names contain spaces or special characters; then the destination page only receives the characters to the right of the last space or special character.
As an example, see the following filename:
Receipt - Hard Drive.PDF
the receiving page (/getfile/?PDF=) would receive only
Drive.PDF
I have tried various regex methods but as with most people my regex knowledge is pretty terrible.
How can I write a 'match URL' that will accept all filenames (at least those accepted by Windows, such as filenames with underscores, dashes, spaces, single quotes, double quotes, pound signs, etc). Is there any way to write something universal that simply passes all characters regardless of what they are, since I really only want to match *.pdf? My current rule is below.
<rule name="PDF Rewrite" stopProcessing="true">
<match url="([\w-]+)\.pdf$" />
<action type="Redirect" url="/getfile/?PDF={R:1}.pdf" logRewrittenUrl="true" redirectType="Temporary" />
</rule>
You may use
<match url="(.+)\.pdf$" />
The .+ matches one or more characters other than line-break characters, as many as possible (it is greedy), so spaces and other special characters end up in the capture.
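To see why the original rule only passed "Drive.PDF", here is a small sketch in Python (used only as a stand-in for the IIS regex engine; the helper name captured_name is hypothetical). It compares what capture group 1, i.e. {R:1}, would hold for the two patterns. Case-insensitive matching is switched on because the sample file name ends in .PDF.

```python
import re

# The asker's original pattern: only word characters and hyphens are allowed.
NARROW = re.compile(r"([\w-]+)\.pdf$", re.IGNORECASE)

# The suggested pattern: any run of characters before the final ".pdf".
BROAD = re.compile(r"(.+)\.pdf$", re.IGNORECASE)


def captured_name(pattern, path):
    """Return capture group 1 (what {R:1} would hold), or None if no match."""
    m = pattern.search(path)
    return m.group(1) if m else None


sample = "Receipt - Hard Drive.PDF"
print(captured_name(NARROW, sample))
print(captured_name(BROAD, sample))
```

Because the captured name can now contain spaces, quotes, or #, it may be worth encoding it in the redirect target, for example with URL Rewrite's {UrlEncode:{R:1}} string function; whether the receiving page needs that is an assumption to verify.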

Regex check if string only has numbers after last instance of character and before last instance of another

I'm trying to write a particular regex pattern for a rewrite rule in IIS and if it matches the pattern to stop processing any more rules.
The Url will look something like this:
somesite.com/somepath/34343.aspx
I need to see if I only have numbers in the section 34343 as I can have
somesite.com/somepath/something343.aspx
I have tried matching the pattern like so:
([0-9]*).aspx$
But this picks up the latter URL and stops processing the rules so later matches aren't run. I need them to run on the later rules and not stop processing.
So if anyone can help, I need some way to check if I only have numbers after the last trailing slash and before the last .(dot)
I have also tried this:
(.*)/(.*)\.(.*)
which seems to give me what I want, inasmuch as it gives me grouped matches:
Full Match- somesite.com/somepath/34343.aspx
Group 1- somesite.com/somepath
Group 2- 34343
Group 3- aspx
But I don't know how to use Group 2 to then check that text for only numbers?
Can anyone help please?
Thanks
EDIT
Thanks for the replies, but these two patterns aren't working for me. I plug them into the IIS URL Rewrite tool and, whilst the rather wonderful test pattern option tells me that they match, the rule just doesn't fire.
<rule name="Ignore id with aspx" enabled="true" stopProcessing="true">
<match url="^(.*\/\d+\..+)$" ignoreCase="true" />
<!--OR-->
<match url="^.*\/\d+\.[^.]+$" ignoreCase="true" />
<conditions logicalGrouping="MatchAll" trackAllCaptures="false" />
</rule>
Or at least it doesn't stop processing any subsequent rules.
Coincidentally, the rule I said I was using ([0-9]*).aspx$ does fire and does stop processing the subsequent rules.
You can make the regex like this:
(.*\/)+(\d*\..+)
This matches only when the segment after a slash consists solely of digits (possibly none, because of \d*) up to the dot.
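A minimal sketch of why the question's pattern fires for both URLs, written in Python as a stand-in for the IIS engine. The STRICT pattern and the helper is_numeric_page are assumptions of mine, not something from the answers: they anchor the match so that the whole last path segment must be digits followed by .aspx.

```python
import re

# The question's pattern: "[0-9]*" may match zero digits, and nothing ties
# the match to the start of the last path segment.
LOOSE = re.compile(r"([0-9]*).aspx$", re.IGNORECASE)

# Assumed stricter variant: an optional "path/" prefix, then digits only,
# then the literal ".aspx" at the end of the input.
STRICT = re.compile(r"^(?:.*/)?(\d+)\.aspx$", re.IGNORECASE)


def is_numeric_page(path):
    """True when the file name before '.aspx' consists only of digits."""
    return STRICT.search(path) is not None


for path in ("somepath/34343.aspx", "somepath/something343.aspx"):
    print(path, bool(LOOSE.search(path)), is_numeric_page(path))
```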
Thanks for the help, but I managed to figure out why the rule wasn't firing, and it now appears to be working.
I set up Failed Request Tracing rules and noticed that the base URL http://somesite.com/ had been removed from the input being checked, so the rule was only matched against the remaining part of the URL.
As the URLs in question were of the form http://somesite.com/12345.aspx, it was then easy to check for just numbers and .aspx in the request.
<rule name="Ignore id with aspx" enabled="true" stopProcessing="true">
<match url="^\d*\..+[aspx]" ignoreCase="true" />
<conditions logicalGrouping="MatchAll" trackAllCaptures="false" />
<action type="None" />
</rule>
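One thing worth checking in the rule above: [aspx] is a character class that matches a single character out of a, s, p, x, not the literal extension. The sketch below (Python as a stand-in for the IIS engine) compares the pattern as written with an assumed tighter variant, ^\d+\.aspx$, which is my suggestion rather than part of the original answer.

```python
import re

# Pattern from the rule above: "[aspx]" matches ONE character from the set
# {a, s, p, x}, and "\d*" also allows zero digits.
AS_WRITTEN = re.compile(r"^\d*\..+[aspx]", re.IGNORECASE)

# Assumed tighter variant: one or more digits, then the literal ".aspx",
# anchored at both ends.
TIGHTER = re.compile(r"^\d+\.aspx$", re.IGNORECASE)

for name in ("12345.aspx", "12345.jpg", "something343.aspx"):
    print(name, bool(AS_WRITTEN.search(name)), bool(TIGHTER.search(name)))
```

Both patterns reject something343.aspx, which is what the rule needs; the difference only shows up for other extensions that happen to contain one of those four letters.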

Regular expression for url rewriting to exclude strings beginning with a year

I have a news page that detects tags based on the query string. So for instance, to filter out all news articles with a tag of 'Popular' I'd have:
<mydomain>/news/?tag=popular
I've set up a url rewrite in my config with the following:
<add name="newsrewrite"
virtualUrl="^~/news/(.*)"
rewriteUrlParameter="ExcludeFromClientQueryString"
destinationUrl="~/news?tag=$1"
ignoreCase="true" />
This works fine. However I've noticed that I now can't access specific news article urls because it treats anything after /news/ as a querystring parameter.
ie. if I try to access /news/2015/news-article-1 then it won't work because the rewrite rule is essentially treating 2015/news-article-1 as the parameter.
Since I've structured my news articles under year folders, all news articles will always be accessed via /news/YYYY/article-title where YYYY is a 4-digit year.
Is there a regular expression I can use here that'll take anything after /news/ and use that as the querystring param EXCEPT those that begin with a 4-digit integer?
Thanks!
If you are looking for a regexp that will work like yours with the exception that it won't match /news/YYYY/.. have a look at this:
^\/news\/(?!\d{4})(.*)$
Note: it makes use of a negative lookahead (check if they are supported in your specific case). Also notice escape characters \.
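Here is a small sketch of how the negative lookahead behaves, using Python as a stand-in for whatever regex engine your rewriter uses. The function names and sample paths are hypothetical. The second pattern is an assumed refinement: it only rejects four digits that form a whole path segment, so a tag such as 2015highlights would still be treated as a tag.

```python
import re

# Pattern from the answer: capture whatever follows /news/ unless it
# starts with four digits.
TAG_ONLY = re.compile(r"^\/news\/(?!\d{4})(.*)$", re.IGNORECASE)

# Assumed refinement: reject only when the four digits are a full segment,
# i.e. followed by "/" or the end of the path.
TAG_ONLY_STRICT_YEAR = re.compile(r"^/news/(?!\d{4}(?:/|$))(.*)$", re.IGNORECASE)


def tag_from_path(path, pattern=TAG_ONLY):
    """Return the captured tag ($1), or None when the rule would not apply."""
    m = pattern.match(path)
    return m.group(1) if m else None


for path in ("/news/popular", "/news/2015/news-article-1", "/news/2015highlights"):
    print(path, tag_from_path(path), tag_from_path(path, TAG_ONLY_STRICT_YEAR))
```

In the config from the question the pattern starts with ^~/news/, so the lookahead would presumably sit right after that prefix, as in ^~/news/(?!\d{4})(.*).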
Reading your problem, I also thought about a different approach: what about rewriting only those URLs that match the actual tag structure? Something like this:
<add name="newsrewrite"
virtualUrl="^~/news/\?tag=(.*)"
rewriteUrlParameter="ExcludeFromClientQueryString"
destinationUrl="~/news?tag=$1"
ignoreCase="true" />
Note that $1 will contain only the tag value (Popular, not ?tag=Popular), as in your code. This should match only URLs of the form /news/?tag=SOMETHING, and therefore not your article pages.

What's wrong with my URL rewrite regex

I use URL rewriting. I added some rules like these:
<add name="Homes" virtualUrl="^/(.*).html" rewriteUrlParameter="ExcludeFromClientQueryString" destinationUrl="/Default.aspx?vsm=$1" ignoreCase="true" />
<add name="HomeNew" virtualUrl="^/(.*)/(.*)/" rewriteUrlParameter="ExcludeFromClientQueryString" destinationUrl="/Default.aspx?vsm=$1&amp;idcnew=$2" ignoreCase="true" />
<add name="HomeNewPage" virtualUrl="^/(.*)/(.*)/page-([0-9-]*).htm" rewriteUrlParameter="ExcludeFromClientQueryString" destinationUrl="/Default.aspx?vsm=$1&amp;idcnew=$2&amp;page=$3" ignoreCase="true" />
<add name="HomeNewNew" virtualUrl="^/(.*)/(.*)/(.*)-([0-9-]*).htm" rewriteUrlParameter="ExcludeFromClientQueryString" destinationUrl="/Default.aspx?vsm=$1&amp;idcnew=$2&amp;idnew=$4" ignoreCase="true" />
I have to use all of them.
The HomeNewPage rule is for paging through a news catalog. The HomeNewNew rule is for a news item's content, where $3 is the URL name of the item.
But when I go to the link "/News/Alert/page-2", my request becomes "vsm=News&idcnew=Alertpage-2".
I want my request to be "vsm=News&idcnew=Alert&page=2".
Please help me! What's wrong, and how do I fix it?
If the URL looks like
"/Default/News/Alert/page-2.html"
Regex: /(.*)/(.*)/(.*)/page-(.*).html
"/Default/News/Alert/page-2"
Regex: /(.*)/(.*)/(.*)/page-(.*)
then you can build the new URL as /Default.aspx?vsm=$2&idcnew=$3&page=$4
Basically, each (.*) captures one argument:
Default is $1
News is $2
Alert is $3
We've hardcoded page-, so it is not an argument. The (.*) after page- therefore becomes argument $4.
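The numbering of the captures can be checked with a short sketch. Python is used here only as a stand-in for the rewriter's regex engine, and the rewrite helper is hypothetical; it fills the destination template by hand the way $2, $3 and $4 would be substituted.

```python
import re

# First pattern from the answer; each (.*) becomes one numbered capture.
PATTERN = re.compile(r"/(.*)/(.*)/(.*)/page-(.*).html")


def rewrite(url):
    """Build the destination URL from captures 2-4, or None if no match."""
    m = PATTERN.match(url)
    if m is None:
        return None
    # Groups 2, 3 and 4 play the role of $2, $3 and $4 in the template.
    return "/Default.aspx?vsm={}&idcnew={}&page={}".format(
        m.group(2), m.group(3), m.group(4)
    )


print(PATTERN.match("/Default/News/Alert/page-2.html").groups())
print(rewrite("/Default/News/Alert/page-2.html"))
```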
Take your second expression, ^/(.*)/(.*)/. It may cause a problem with a URL like News/Alert/page-2 because that URL doesn't have a "/" at the end. The same goes for .html.
As you haven't provided the full URL, I'm not sure about the ^. It only requires the URL to satisfy the regex from its first character, so it shouldn't cause any problem. You can keep it or remove it.
In my opinion, your regex
/([a-zA-Z0-9-]*)/([a-zA-Z0-9-]*)/page-([0-9-]*)\.htm
will work. But this expression
/([a-zA-Z0-9-]*)/([a-zA-Z0-9-]*)/
won't work, as it has no argument for the third segment (page-2); it only covers the URL up to /News/Alert. Please check your code or try to debug it; I suspect something is hardcoded, which is why you're getting the error.
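A rough sketch of the two expressions side by side, again with Python standing in for the rewriter's engine. It also shows that the asker's unanchored HomeNew pattern matches the paging URL as a prefix. If the rewriter applies the first rule that matches (an assumption, the source doesn't say), listing the more specific paging rule first, or anchoring the patterns with $, might be worth trying.

```python
import re

# Paging pattern from the answer above.
PAGE = re.compile(r"/([a-zA-Z0-9-]*)/([a-zA-Z0-9-]*)/page-([0-9-]*)\.htm")

# The asker's HomeNew pattern: not anchored at the end, so it also
# matches longer URLs as a prefix.
CATALOG = re.compile(r"^/(.*)/(.*)/")

url = "/News/Alert/page-2.htm"
print(PAGE.search(url).groups())
print(CATALOG.search(url).groups())
print(PAGE.search("/News/Alert/page-2"))
```

The last line shows that the paging pattern does not match the link exactly as quoted in the question, because that link has no .htm suffix.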
Here is an easy way to test your expressions, so that you can validate them properly.
Go to this link: http://www.rexfiddle.net/
Insert your expression, and in the second window insert the URL you want to test. The captures are then listed at the end as capture 1, capture 2, capture 3. These become your arguments ($1, $2, ...).

Regex URL rewriting for custom login page

I'm implementing a custom login page for a multitenant portal where each client gets a different login page styled according to their stored settings.
To achieve this I am using IIS 7.5 with the URL Rewriting module.
My idea is to capture requests for "http://portal.com/client1/" and rewrite them into "http://portal.com/login.aspx?client=client1".
What I'm struggling with is the regex expression to match the URL and extract the "client1" bit out.
EXAMPLES:
"http://portal.com/pepsi" = "http://portal.com/login.aspx?client=pepsi"
"http://portal.com/fedex" = "http://portal.com/login.aspx?client=fedex"
"http://portal.com/northwind" = "http://portal.com/login.aspx?client=northwind"
"http://portal.com/microsoft/" = "http://portal.com/login.aspx?client=microsoft"
So the match should be found if the requested URL contains a single word after the first "/" and work whether there is a trailing "/" or not.
"http://portal.com/clients/home.aspx" would be ignored by the rule.
"http://portal.com/clients/catalog" would be ignored by the rule.
"http://portal.com/products.aspx" would be ignored by the rule.
Assuming:
that the parameter name is always client,
and that you don't care about what comes after /client1/,
you can use this simple pattern to capture that portion of the URL and then pass it on as a parameter:
<rewrite>
<rules>
<rule name="client1 rewrite">
<match url="^([^/.]*)[/]*$" />
<action type="Rewrite" url="login.aspx?client={R:1}"/>
</rule>
</rules>
</rewrite>
This works because every URL in the ignore list either has a slash in the "middle" or contains a dot, both of which [^/.] excludes, while the trailing slash is optional. So {R:1} contains everything up to the first slash, or up to the end of the URL if there is no slash.
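The examples from the question can be run against the pattern with a short sketch (Python as a stand-in for the IIS engine; the rewrite_target helper is hypothetical). The inputs have no leading slash, on the assumption that the rule sees the path relative to the site root. One caveat: because of the *, an empty path also matches and yields an empty client; the stricter variant with + is my own suggestion.

```python
import re

# Pattern from the rule above.
CLIENT = re.compile(r"^([^/.]*)[/]*$")

# Assumed stricter variant: require at least one character for the client.
CLIENT_NONEMPTY = re.compile(r"^([^/.]+)/*$")


def rewrite_target(path, pattern=CLIENT):
    """Return the rewritten URL, or None when the rule would ignore the path."""
    m = pattern.match(path)
    if m is None:
        return None
    return "login.aspx?client=" + m.group(1)


for path in ("pepsi", "microsoft/", "clients/home.aspx", "clients/catalog", "products.aspx"):
    print(path, "->", rewrite_target(path))
```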