What's wrong with my URL rewrite regex?

I use URL rewriting and have added rules like these:
<add name="Homes" virtualUrl="^/(.*).html" rewriteUrlParameter="ExcludeFromClientQueryString" destinationUrl="/Default.aspx?vsm=$1" ignoreCase="true" />
<add name="HomeNew" virtualUrl="^/(.*)/(.*)/" rewriteUrlParameter="ExcludeFromClientQueryString" destinationUrl="/Default.aspx?vsm=$1&idcnew=$2" ignoreCase="true" />
<add name="HomeNewPage" virtualUrl="^/(.*)/(.*)/page-([0-9-]*).htm" rewriteUrlParameter="ExcludeFromClientQueryString" destinationUrl="/Default.aspx?vsm=$1&idcnew=$2&page=$3" ignoreCase="true" />
<add name="HomeNewNew" virtualUrl="^/(.*)/(.*)/(.*)-([0-9-]*).htm" rewriteUrlParameter="ExcludeFromClientQueryString" destinationUrl="/Default.aspx?vsm=$1&idcnew=$2&idnew=$4" ignoreCase="true" />
I have to use all of them.
The HomeNewPage rule is for a page of the news catalog. The HomeNewNew rule is for the content of a news item, where $3 is the URL name of the item.
But when I go to the link "/News/Alert/page-2", my request is "vsm=News&idcnew=Alertpage-2".
I want my request to be "vsm=News&idcnew=Alert&page=2".
Please help! What's wrong, and how do I fix it?

If the URL looks like
"/Default/News/Alert/page-2.html"
Regex: /(.*)/(.*)/(.*)/page-(.*)\.html
or like
"/Default/News/Alert/page-2"
Regex: /(.*)/(.*)/(.*)/page-(.*)
then you can build the new URL as /Default.aspx?vsm=$2&idcnew=$3&page=$4
Each (.*) is a capture group that becomes an argument:
Default is $1
News is $2
Alert is $3
The "page-" prefix is hardcoded, so it is not an argument. The (.*) after "page-" therefore becomes argument $4.
Now take your second expression, ^/(.*)/(.*)/. It can cause problems with a URL like /News/Alert/page-2, because that URL does not end with "/". The same applies to the rules that expect .htm or .html: a URL without that extension will not match them.
You haven't shown the full URL, so I'm not sure about the ^. It only requires the match to start at the first character of the URL, which shouldn't cause any problem here. You can keep it or remove it.
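One possible explanation for the observed "vsm=News&idcnew=Alertpage-2" can be reproduced outside the rewriter. The sketch below uses Python's re module as a stand-in for the .NET regex engine and assumes the rewriter tries the rules in configuration order and substitutes only the matched part of the URL. Both points are assumptions about the rewriting module, not confirmed behaviour. FIXED_RULES is a hypothetical alternative, not a rule set taken from the question.

```python
import re

# Rules from the question, in configuration order: (name, pattern, replacement).
# Python writes back-references as \1 where the config writes $1.
RULES = [
    ("Homes", r"^/(.*).html", r"/Default.aspx?vsm=\1"),
    ("HomeNew", r"^/(.*)/(.*)/", r"/Default.aspx?vsm=\1&idcnew=\2"),
    ("HomeNewPage", r"^/(.*)/(.*)/page-([0-9-]*).htm",
     r"/Default.aspx?vsm=\1&idcnew=\2&page=\3"),
]

# Hypothetical stricter rules: segments cannot contain "/", the patterns are
# anchored at both ends, and the more specific page rule comes first.
FIXED_RULES = [
    ("HomeNewPage", r"^/([^/]+)/([^/]+)/page-([0-9]+)/?$",
     r"/Default.aspx?vsm=\1&idcnew=\2&page=\3"),
    ("HomeNew", r"^/([^/]+)/([^/]+)/?$", r"/Default.aspx?vsm=\1&idcnew=\2"),
]


def rewrite(url, rules=RULES):
    """Apply the first rule whose pattern matches; return (rule name, result)."""
    for name, pattern, replacement in rules:
        if re.search(pattern, url, flags=re.IGNORECASE):
            # Only the matched part is replaced; any unmatched tail is kept.
            result = re.sub(pattern, replacement, url, count=1, flags=re.IGNORECASE)
            return name, result
    return None, url


print(rewrite("/News/Alert/page-2"))
print(rewrite("/News/Alert/page-2", FIXED_RULES))
```

Under these assumptions, HomeNew matches only "/News/Alert/" and the leftover "page-2" is appended to the rewritten URL, while HomeNewPage never matches because the URL has no ".htm".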

In my opinion, your regex
/([a-zA-Z0-9-]*)/([a-zA-Z0-9-]*)/page-([0-9-]*)\.htm
will work. But this expression
/([a-zA-Z0-9-]*)/([a-zA-Z0-9-]*)/
won't work, because it has no argument for the third segment (page-2). It only works up to /News/Alert. Please check or debug your code; I suspect something is hardcoded there, and that is why you are getting the error.
Here is an easy way to test your expressions, so that you can validate them yourself.
Go to this link: http://www.rexfiddle.net/
Enter your expression, and in the second window enter the URL you want to test. The results list the captures at the end, like capture 1, capture 2, capture 3. These are your arguments ($1, $2, ...).
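The same kind of capture check can also be done locally. This is a minimal sketch in Python, whose re module is used here as a stand-in for the rewriter's regex engine (an assumption). The helper name captures is made up for illustration.

```python
import re


def captures(pattern, url):
    """Return the captured groups as a list, or None when there is no match."""
    match = re.search(pattern, url, flags=re.IGNORECASE)
    return list(match.groups()) if match else None


PAGE_PATTERN = r"/([a-zA-Z0-9-]*)/([a-zA-Z0-9-]*)/page-([0-9-]*)\.htm"

# With the .htm extension the three captures correspond to $1, $2 and $3.
print(captures(PAGE_PATTERN, "/News/Alert/page-2.htm"))

# Without the extension the pattern does not match at all.
print(captures(PAGE_PATTERN, "/News/Alert/page-2"))
```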

Related

split URL using Regular Expression in IIS

I am having trouble rewriting a URL using a regular expression.
I want to change the middle part of the URL, because the URLs of the site pages have changed.
https://test.company.com/about/news/2015/test/award.aspx
The URL above needs to be replaced with the one below:
https://test.company.com/en/about/media/news/2015/test/award.aspx
I want to achieve this with a regular expression in IIS.
I tried the following pattern in URL Rewrite in IIS:
about/news/2015(.*.+?)
Please help me get this working as required. Thanks in advance.
The regex engine gets the string without the host and protocol, starting with about. Thus, you need to match starting with this fixed string, capture the parts between which you need to insert the required value and use
^(about)(/news/2015/)
Replace with
en/{R:1}/media{R:2}
Where {R:1} refers to about and the {R:2} refers to /news/2015/.
Here is a demo of how this regex works.
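As a rough, independent check of that substitution, the sketch below applies the same pattern in Python. It assumes the input is the path without protocol and host, as described above, and writes the IIS back-references {R:1} and {R:2} as \1 and \2. The function name rewrite_path is hypothetical.

```python
import re


def rewrite_path(path):
    """Insert "en/" before "about" and "/media" after it, as in the answer."""
    # {R:1} -> \1 ("about"), {R:2} -> \2 ("/news/2015/")
    return re.sub(r"^(about)(/news/2015/)", r"en/\1/media\2", path)


print(rewrite_path("about/news/2015/test/award.aspx"))
# Paths that do not start with about/news/2015/ are returned unchanged.
print(rewrite_path("contact/index.aspx"))
```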
You could try the rule below:
<rule name="rule11-1" stopProcessing="true">
  <match url="^about/news/2015/(.*)" />
  <conditions>
    <add input="{REQUEST_URI}" pattern="^/about/(.*)" />
  </conditions>
  <action type="Redirect" url="https://test.company.com/en/about/media/{C:1}" />
</rule>

Regex check if string only has numbers after last instance of character and before last instance of another

I'm trying to write a regex pattern for a rewrite rule in IIS. If the URL matches the pattern, no further rules should be processed.
The Url will look something like this:
somesite.com/somepath/34343.aspx
I need to see if I only have numbers in the section 34343 as I can have
somesite.com/somepath/something343.aspx
I have tried matching the pattern like so:
([0-9]*).aspx$
But this also matches the second URL and stops rule processing, so later rules never run. For that URL I need the later rules to run.
So I need some way to check that there are only digits after the last slash and before the last dot.
I have also tried this:
(.*)/(.*)\.(.*)
which seems to give me what I want, inasmuch as it gives me grouped matches:
Full Match- somesite.com/somepath/34343.aspx
Group 1- somesite.com/somepath
Group 2- 34343
Group 3- aspx
But I don't know how to then check that Group 2 contains only digits.
Can anyone help please?
Thanks
EDIT
Thanks for the replies, but these two patterns aren't working for me. I plug them into the IIS URL Rewrite tool, and whilst the rather wonderful test pattern option tells me that they match, the rule just doesn't fire.
<rule name="Ignore id with aspx" enabled="true" stopProcessing="true">
  <match url="^(.*\/\d+\..+)$" ignoreCase="true" />
  <!--OR-->
  <match url="^.*\/\d+\.[^.]+$" ignoreCase="true" />
  <conditions logicalGrouping="MatchAll" trackAllCaptures="false" />
</rule>
Or at least it doesn't stop processing anymore subsequent rules.
Coincidentally, the rule I said I was using ([0-9]*).aspx$ does fire and does stop processing the subsequent rules.
You can make the regex like this:
(.*\/)+(\d*\..+)
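This matches only when a slash is followed directly by digits, a dot and an extension, so the file name must consist of digits only. Note that \d* also accepts zero digits; use \d+ to require at least one.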
Thanks for the help, but I managed to figure out why the rule wasn't firing, and it now appears to be working.
I set up Failed Request Tracing rules and noticed that the base URL http://somesite.com/ had been removed before the check. So the rule was only looking at the last part of the URL.
As the URLs in question were of the form http://somesite.com/12345.aspx, it was then easy to check for just digits and .aspx in the request.
<rule name="Ignore id with aspx" enabled="true" stopProcessing="true">
  <match url="^\d*\..+[aspx]" ignoreCase="true" />
  <conditions logicalGrouping="MatchAll" trackAllCaptures="false" />
  <action type="None" />
</rule>
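The pattern in that rule is looser than "digits followed by .aspx", because [aspx] is a character class that matches a single one of those letters, and \d* accepts zero digits. The sketch below compares it with a tighter pattern in Python, used here as a stand-in for the IIS regex engine. STRICT is a hypothetical alternative, not something taken from the thread.

```python
import re

LOOSE = r"^\d*\..+[aspx]"   # pattern from the rule above
STRICT = r"^\d+\.aspx$"     # hypothetical: one or more digits, then exactly ".aspx"


def matches(pattern, path):
    """True when the pattern matches the path (case-insensitive, like ignoreCase)."""
    return re.search(pattern, path, flags=re.IGNORECASE) is not None


for path in ["12345.aspx", "something343.aspx", "12345.css", ".xa"]:
    print(path, matches(LOOSE, path), matches(STRICT, path))
```

Both patterns accept 12345.aspx and reject something343.aspx, but only the loose one also accepts inputs such as 12345.css or .xa.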

Regular expression that matches on optional single querystring parameter

This IIS rewrite rule:
<rule name="music search state">
  <match url="^music/state/([a-zA-Z-+']+)?$" />
  <action type="Rewrite" url="search.aspx?stateurl={R:1}&amp;t=2" appendQueryString="true" />
</rule>
Matches on URL:
www.example.com/music/state/north-dakota
But not on URL:
www.example.com/music/state/north-dakota?country=usa
I also tried:
^music/state/([a-zA-Z-+']+)? and ^music/state/([a-zA-Z-+']+)
But none of these work on both URLs...what do I need to change in this expression so it matches on both?
update
I tried extending the last part of the expression, but these two expressions cause a "Configuration file is not well-formed XML" error in IIS:
([a-zA-Z-+'?\&=]+)
([a-zA-Z-+'?&=]+)
and this expression, with the ampersand escaped, does not match the URL: ([a-zA-Z-+'?&amp;=]+)
In the regex ^music/state/([a-zA-Z-+']+)?$, the character class [a-zA-Z-+'] defines which characters may appear after music/state/. Extend it with the characters you need.
Something like this: ^music/state/([a-zA-Z-+'?&=]+)?$
BTW, you can check your regex using an online interpreter, like http://regexr.com/
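The effect of the extended character class can be sketched in Python, used here as a stand-in for the IIS regex engine. The sketch assumes the pattern is tested against a string that includes the query string. In IIS URL Rewrite the match url pattern normally sees only the path, with the query string available separately as {QUERY_STRING}, so this illustrates the regex rather than confirming how the rule behaves. The helper name captured is hypothetical.

```python
import re
from xml.sax.saxutils import escape

ORIGINAL = r"^music/state/([a-zA-Z-+']+)?$"
EXTENDED = r"^music/state/([a-zA-Z-+'?&=]+)?$"


def captured(pattern, url):
    """Return capture group 1, or None when the pattern does not match."""
    match = re.search(pattern, url)
    return match.group(1) if match else None


plain = "music/state/north-dakota"
with_query = "music/state/north-dakota?country=usa"

print(captured(ORIGINAL, plain))        # the state name
print(captured(ORIGINAL, with_query))   # no match: "?" and "=" are not in the class
print(captured(EXTENDED, with_query))   # matches, but the capture includes the query

# Inside web.config the "&" has to be written as "&amp;" to keep the XML well-formed.
print(escape(EXTENDED))
```

Note that with the extended class the captured value contains "?country=usa" as well, so {R:1} would no longer be just the state name.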

IIS URL Rewrite URL with Optional Index Page

I've been able to pattern-match a url with or without a trailing slash
<match url="locations/(.+?)/?$" />
However, I want to be able to match locations/[location]/index.aspx as well.
How can I incorporate this optional pattern?
I tried the clumsy:
<match url="(towns/(.+?)/?$)|(towns/(.+?)/index\.aspx$)" />
which wasn't liked at all!
Any help would be appreciated.
This
(?<=locations\/)(.+?)(?=\/|$|\s)
will match [location] in
/locations/[location]/
/locations/[location]
/locations/[location]/index.aspx
/locations/[location]/anything_on_the_planet.html
/locations/[location] <A bunch of text over here>
If I understand your question correctly, you want the match to work with or without a trailing slash, to follow locations/, and to work regardless of what comes after [location].
If you need anything else, let me know.
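The lookaround pattern above can be tried out in Python, used here as a stand-in for the IIS regex engine (an assumption). For comparison, the sketch also includes ANCHORED, a hypothetical single pattern that accepts only the three forms asked about: no trailing slash, a trailing slash, or /index.aspx. It is not part of the answer above.

```python
import re

LOOKAROUND = r"(?<=locations\/)(.+?)(?=\/|$|\s)"           # pattern from the answer
ANCHORED = r"locations/([^/]+)(?:/(?:index\.aspx)?)?$"     # hypothetical alternative


def location(pattern, url):
    """Return the captured location, or None when the pattern does not match."""
    match = re.search(pattern, url, flags=re.IGNORECASE)
    return match.group(1) if match else None


for url in [
    "locations/paris",
    "locations/paris/",
    "locations/paris/index.aspx",
    "locations/paris/other.html",
]:
    print(url, location(LOOKAROUND, url), location(ANCHORED, url))
```

The lookaround version captures the location whatever follows it, while the anchored version rejects anything other than the optional slash and index.aspx.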

Regular expression for url rewriting to exclude strings beginning with a year

I have a news page that detects tags based on the query string. So for instance, to filter out all news articles with a tag of 'Popular' I'd have:
<mydomain>/news/?tag=popular
I've set up a url rewrite in my config with the following:
<add name="newsrewrite"
     virtualUrl="^~/news/(.*)"
     rewriteUrlParameter="ExcludeFromClientQueryString"
     destinationUrl="~/news?tag=$1"
     ignoreCase="true" />
This works fine. However I've noticed that I now can't access specific news article urls because it treats anything after /news/ as a querystring parameter.
ie. if I try to access /news/2015/news-article-1 then it won't work because the rewrite rule is essentially treating 2015/news-article-1 as the parameter.
Since I've structured my news articles under year folders, all news articles will always be accessed via /news/YYYY/article-title where YYYY is a 4-digit year.
Is there a regular expression I can use here that'll take anything after /news/ and use that as the querystring param EXCEPT those that begin with a 4-digit integer?
Thanks!
If you are looking for a regexp that will work like yours with the exception that it won't match /news/YYYY/.. have a look at this:
^\/news\/(?!\d{4})(.*)$
Note: it makes use of a negative lookahead (check if they are supported in your specific case). Also notice escape characters \.
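The negative lookahead can be checked with a short sketch in Python, used here as a stand-in for the rewriter's regex engine (an assumption; whether lookaheads are available depends on the engine actually used). The leading "~" from the question's virtualUrl is left out, and the helper name tag_for is hypothetical.

```python
import re

# (?!\d{4}) fails the match when the text right after /news/ starts with four digits.
TAG_PATTERN = r"^/news/(?!\d{4})(.*)$"


def tag_for(path):
    """Return the text after /news/ as the tag, or None for year-style paths."""
    match = re.match(TAG_PATTERN, path, flags=re.IGNORECASE)
    return match.group(1) if match else None


print(tag_for("/news/popular"))               # treated as a tag
print(tag_for("/news/2015/news-article-1"))   # article URL, not rewritten
print(tag_for("/news/2015-roundup"))          # also excluded: it starts with four digits
```

If tags that merely begin with four digits should still be rewritten, the lookahead could be narrowed to (?!\d{4}/), which is again only a suggestion.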
Reading your problem, I also thought of a different approach: what about rewriting only the URLs that match the actual tag structure? Something like this:
<add name="newsrewrite"
     virtualUrl="^~/news/\?tag=(.*)"
     rewriteUrlParameter="ExcludeFromClientQueryString"
     destinationUrl="~/news?tag=$1"
     ignoreCase="true" />
Note that the ? in the pattern has to be escaped as \?, otherwise it makes the preceding / optional instead of matching a literal question mark. $1 will contain only the tag (Popular, not ?tag=Popular), as in your code. This should match only URLs of the form /news/?tag=SOMETHING and therefore not your article pages.