How to write regex to remove last parameter from a URL - regex

I am trying to create URL rewrite inbound rule for IIS to return the URL before a specific parameter.
Edited: I should have stated the authProvider is always the last parameter.
Example:
http://localhost/WebAccess/Default.ashx?accessionNumber=009&authProvider=Bypass
I want to trim off &authProvider=Bypass from the end of the URL
I've tried:
.*(?=&authProvider)
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
<system.webServer>
<rewrite>
<rules>
<rule name="Test" enabled="true">
<match url="(.*)&amp;authProvider.+" />
<action type="Rewrite" url="{R:0}" logRewrittenUrl="true" />
</rule>
</rules>
</rewrite>
</system.webServer>
</configuration>

The lookahead pattern you have shown matches the part you want, but it does not change the URL. The match result itself is the string you want to keep.
I think the problem may be that you are replacing what you match: because the lookahead (?=...) is not part of the match result, the replacement puts back the same text and you end up with the string you started with. As an alternative, bearing in mind that this is not robust if the parameter order changes, you could use:
(.*)&authProvider.+
Then replace with
$1
This will result in:
http://localhost/WebAccess/Default.ashx?accessionNumber=009
Essentially, it matches the whole string and replaces it with everything before &authProvider, which is captured in group 1 ($1).
Update
With your update, I see that IIS rewrite rules use {R:1} for back-references, so $1 in my example becomes {R:1}, and the action in your rewrite rule should use {R:1} rather than {R:0}. See here for an example.
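
The difference between the lookahead and the capture-group approach can be checked outside IIS. The sketch below uses Python's re module as a stand-in for the IIS regex engine (an assumption; the two engines agree on these simple patterns) and applies both patterns to the example URL from the question.

```python
import re

# Example URL from the question.
url = "http://localhost/WebAccess/Default.ashx?accessionNumber=009&authProvider=Bypass"

# Lookahead: "&authProvider" is asserted but not consumed, so the match
# itself is already the trimmed URL.
lookahead_match = re.match(r".*(?=&authProvider)", url)
trimmed_by_lookahead = lookahead_match.group(0)

# Capture group: the pattern consumes the whole string, and the replacement
# keeps only group 1 ($1 in the answer, {R:1} in IIS notation).
trimmed_by_group = re.sub(r"(.*)&authProvider.+", r"\g<1>", url)

print(trimmed_by_lookahead)
print(trimmed_by_group)
```

Both approaches yield the URL without the trailing parameter. Note that in IIS itself the input to <match url> does not include the query string, so a pattern like this would normally be tested against {QUERY_STRING} in a condition; that wiring is not shown here.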

Related

301 Redirect Regex Pattern

I'm trying to write an IIS redirect rule that redirects from this URL pattern, but I can't work it out:
https://www.mycompanyPLC.com/en/lorem/ipsum/whatever
to
https://www.mycompanyLTD.com/lorem/ipsum/whatever
Basically, I need to replace PLC with LTD and, if the URL contains the "/en/" segment, remove it.
You can meet both requirements with a single regex, provided /en/ is immediately preceded by .com. Something like:
(.*?)PLC\.com(?:\/\ben\b)?(.*)
Explanation of the above regex:
(.*?) - Represents 1st capturing group capturing everything before PLC lazily.
PLC\.com - Matches PLC.com literally.
(?:\/\ben\b)? - Represents a non-capturing group matching /en literally zero or one time. \b represents a word boundary.
(.*) - Represents the second capturing group, matching everything after /en greedily.
$1LTD.com$2 - For the replacement (or redirection, in this case) part you can use this string, where $1 represents the first captured group and $2 the second. In your case, you can use {R:1}LTD.com{R:2}.
You can find the demo of the above regex in here.
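
As a quick check of the explanation above, here is a minimal sketch that applies the same pattern and replacement with Python's re module (used here only as a stand-in for the IIS engine). The helper name to_ltd is hypothetical.

```python
import re

# Pattern from the answer: lazy prefix, literal "PLC.com", optional "/en"
# delimited by word boundaries, then the rest of the URL.
PATTERN = re.compile(r"(.*?)PLC\.com(?:\/\ben\b)?(.*)")


def to_ltd(url: str) -> str:
    """Replace PLC.com with LTD.com and drop a leading /en segment."""
    # \g<1> and \g<2> correspond to $1 and $2 ({R:1} and {R:2} in IIS).
    return PATTERN.sub(r"\g<1>LTD.com\g<2>", url)


print(to_ltd("https://www.mycompanyPLC.com/en/lorem/ipsum/whatever"))
print(to_ltd("https://www.mycompanyPLC.com/lorem/ipsum/whatever"))
print(to_ltd("https://www.mycompanyPLC.com/english/page"))
```

The third call shows why the word boundaries matter: /english is left intact because \b does not match between "en" and "glish".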
Please refer to below URL rule.
<system.webServer>
<rewrite>
<rules>
<rule name="ReverseProxyInboundRule1" stopProcessing="true">
<match url="en(.*)" />
<action type="Redirect" url="https://www.mycompanyLTD.com{R:1}" />
</rule>
</rules>
</rewrite>
</system.webServer>
There is no need to match the whole URL. The rule redirects any request whose path contains an en segment, and the http/https scheme does not need to be matched either.
Feel free to let me know if there is anything I can help with.
After several hours of reading about regex, I created this rule, and it seems to work (I've tested several scenarios):
^(http|https)://?(www.)mycompanyPLC.com/en?(.*)
and the Redirect URL from IIS is:
https://www.mycompanyLTD.com/{R:3}
Later edit:
The rule in IIS is like this:
<rule name="Replace PLC with LTD and remove /en/" enabled="true" stopProcessing="true">
<match url="(.*?)PLC\.com(?:\/\ben\b)?(.*)" />
<conditions logicalGrouping="MatchAny" trackAllCaptures="false" />
<action type="Redirect" url="{R:1}ltd.com{R:2}" />
</rule>
Test urls were this format:
http://webdev.myCompanyplc.com/en/our-experience/retail
{R:1} = http://webdev.myCompany
{R:2} = /our-experience/retail
The regular expression was OK, but the redirect still didn't work.
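
One possible explanation, offered as an assumption rather than something the poster confirmed: IIS applies <match url> to the path only, without the scheme and host, so a pattern that depends on the host name never sees it. The sketch below uses Python's re module as a stand-in, with re.IGNORECASE mirroring the case-insensitive matching shown in the test results; the path_only value is the assumed input.

```python
import re

PATTERN = re.compile(r"(.*?)PLC\.com(?:\/\ben\b)?(.*)", re.IGNORECASE)

# The full URL used in the poster's pattern test.
full_url = "http://webdev.myCompanyplc.com/en/our-experience/retail"
# Assumed input that <match url> actually receives: the path without the
# scheme, the host, or the leading slash.
path_only = "en/our-experience/retail"

full_match = PATTERN.search(full_url)
path_match = PATTERN.search(path_only)

print(full_match.group(1), full_match.group(2))
print(path_match)  # no match: "plc.com" is not part of the path
```

If this is the cause, the host part would typically be tested in a condition on {HTTP_HOST}, with <match url> handling only the optional en prefix.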

Rewrite Map Rule to match any URL extension

I'm fairly new to rewrite maps, but we did get ours to work on a very basic level. After a website redesign, we set up an extensive rewrite map (thousands of rules) to point the old pages to the new ones. The trouble is that we have to add multiple entries for the same page in order for the rewrite to work.
Example:
http://www.abc123.com/About --> http://www.abc123.com/about-us
http://www.abc123.com/About.aspx --> http://www.abc123.com/about-us
http://www.abc123.com/about/ --> http://www.abc123.com/about-us
http://www.abc123.com/about.aspx --> http://www.abc123.com/about-us
There should be a way to wildcard anything after the base URL in the regular expression - I'm expecting something like this: ^./[about]$ which would be great if ALL urls contained "about" but they don't.
Also note that we aren't redirecting by directory, but rather by file name. It's that our CMS is set up not to use the .aspx extension, so any extension will work.
What I want is to only have to have ONE rule for each URL that looks like:
"http://www.abc123.com/about" and it will point all of the above variations to the new URL regardless if it does not have an extension or if the extension is .html, .asp, .aspx, or .whatever
Is that beyond the capabilities of the rewrite rules or is there some basic regular expression I am missing?
Here is the rule we are using:
<rule name="Redirect Rule for Legacy Redirects" enabled="true" stopProcessing="true">
<match url=".*" />
<conditions>
<add input="{Redirects:{REQUEST_URI}}" pattern="(.+)" />
</conditions>
<action type="Redirect" url="{C:1}" appendQueryString="false" />
</rule>
Any insight would be much appreciated.
[Hh][Tt][Tt][Pp]://(([^/])/)[Aa][Bb][Oo][Uu][Tt].*
See https://regex101.com/r/rZhJyz/1, just append "about-us" to the Group #1 match.
Had to figure this out today. It can be done by using the <match url> regex to strip off the extension, and then using the matched portion from here as the input for the rewrite map lookup.
The rewrite map keys must NOT have a starting /.
The rule (and sample map) looks like this, example for stripping off .aspx extension (could be generalised):
<rewrite>
<rewriteMaps>
<rewriteMap name="Test">
<add key="test" value="http://www.google.com" />
<add key="test.aspx" value="http://www.google.com" />
</rewriteMap>
</rewriteMaps>
<rules>
<rule name="Rewrite Map Optional Aspx Extension" stopProcessing="true">
<match url="^(.*?)(\.aspx)?$" />
<conditions>
<add input="{Test:{R:1}}" pattern="(.+)" />
</conditions>
<action type="Redirect" url="{C:1}" appendQueryString="false" />
</rule>
</rules>
</rewrite>
The important changes from a standard rewrite map rule are:
Added (\.aspx)? as an optional part of the match url, and added ? to the initial .* to make it non-greedy so that it doesn't include the extension.
Changed {Test:{REQUEST_URI}} to {Test:{R:1}} so that the lookup uses the first capture group from the match url.
Removed the leading / from the rewrite map keys.
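
The lookup flow described above can be simulated outside IIS. In this minimal sketch a Python dict stands in for the rewrite map, and the function name lookup is hypothetical. ANY_EXTENSION_OPTIONAL is an assumed generalisation that is not part of the original answer; it treats any single alphanumeric extension as optional.

```python
import re

# Stand-in for <rewriteMap name="Test">; keys have no leading "/".
REDIRECTS = {"test": "http://www.google.com"}

# Pattern from the rule: lazy group 1 followed by an optional ".aspx".
ASPX_OPTIONAL = re.compile(r"^(.*?)(\.aspx)?$")

# Hypothetical generalisation (an assumption, not from the answer):
# any single alphanumeric extension is optional.
ANY_EXTENSION_OPTIONAL = re.compile(r"^(.*?)(\.[A-Za-z0-9]+)?$")


def lookup(path, pattern=ASPX_OPTIONAL):
    """Return the redirect target for a path-only URL, or None if unmapped."""
    match = pattern.match(path)
    if match is None:
        return None
    # Group 1 plays the role of {R:1}, the key passed to the rewrite map.
    return REDIRECTS.get(match.group(1))


print(lookup("test"))
print(lookup("test.aspx"))
print(lookup("test.html"))
print(lookup("test.html", ANY_EXTENSION_OPTIONAL))
```

With the extension stripped before the lookup, a single map key per page is enough; the test.aspx key in the sample map is no longer needed.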

Regex check if string only has numbers after last instance of character and before last instance of another

I'm trying to write a particular regex pattern for a rewrite rule in IIS and if it matches the pattern to stop processing any more rules.
The Url will look something like this:
somesite.com/somepath/34343.aspx
I need to see if I only have numbers in the section 34343 as I can have
somesite.com/somepath/something343.aspx
I have tried matching the pattern like so:
([0-9]*).aspx$
But this picks up the latter URL and stops processing the rules so later matches aren't run. I need them to run on the later rules and not stop processing.
So if anyone can help, I need some way to check if I only have numbers after the last trailing slash and before the last .(dot)
I have also tried this:
(.*)/(.*)\.(.*)
which seems to give me what I want, inasmuch as it gives me grouped matches:
Full Match- somesite.com/somepath/34343.aspx
Group 1- somesite.com/somepath
Group 2- 34343
Group 3- aspx
But I don't know how to use Group 2 to then check that text for only numbers?
Can anyone help please?
Thanks
EDIT
Thanks for the replies, but these two patterns aren't working for me. I plug them into the IIS URL Rewrite tool and, whilst the rather wonderful test pattern option tells me that they match, the rule just doesn't fire.
<rule name="Ignore id with aspx" enabled="true" stopProcessing="true">
<match url="^(.*\/\d+\..+)$" ignoreCase="true" />
<!--OR-->
<match url="^.*\/\d+\.[^.]+$" ignoreCase="true" />
<conditions logicalGrouping="MatchAll" trackAllCaptures="false" />
</rule>
Or at least it doesn't stop processing the subsequent rules.
Coincidentally, the rule I said I was using ([0-9]*).aspx$ does fire and does stop processing the subsequent rules.
You can make the regex like this:
(.*\/)+(\d*\..+)
This will check if it contains only digits.
Thanks for the help, but I managed to figure out why it wasn't firing the rule and it appears to now be working.
I set up Failed Request Tracing rules and noticed that the rule had removed the base URL http://somesite.com/ from the check, so it was only looking at the last part of the URL.
As the URLs in question were of the form http://somesite.com/12345.aspx, it was then easy to check for just numbers and .aspx in the request.
<rule name="Ignore id with aspx" enabled="true" stopProcessing="true">
<match url="^\d*\..+[aspx]" ignoreCase="true" />
<conditions logicalGrouping="MatchAll" trackAllCaptures="false" />
<action type="None" />
</rule>
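
To compare the patterns discussed in this thread, the sketch below tests them against path-only inputs using Python's re module as a stand-in for the IIS engine. STRICT is a hypothetical stricter alternative, not the poster's rule: it requires the final path segment to be digits followed by exactly .aspx. Note that [aspx] in the posted pattern is a character class matching a single one of those letters, not the literal extension.

```python
import re

# Original attempt: unanchored, so trailing digits in any name match.
LOOSE = re.compile(r"([0-9]*).aspx$")

# Pattern from the final rule: [aspx] matches one character only.
POSTED = re.compile(r"^\d*\..+[aspx]", re.IGNORECASE)

# Hypothetical stricter pattern (an assumption, not from the thread):
# optional leading folders, then digits only, then ".aspx".
STRICT = re.compile(r"^(?:.*/)?\d+\.aspx$", re.IGNORECASE)


def is_numeric_page(path):
    """True if the last path segment is all digits plus .aspx."""
    return STRICT.search(path) is not None


print(LOOSE.search("somepath/something343.aspx") is not None)
print(POSTED.search("12345.php") is not None)
print(is_numeric_page("somepath/34343.aspx"))
print(is_numeric_page("somepath/something343.aspx"))
```

The posted rule works for the URLs in question, but it would also stop processing for requests such as 12345.php; anchoring on the literal extension avoids that.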

Regular expression to filter 2 urls

Can I filter (or match) only the two URLs below using a regular expression?
http://www.domain.com/owner/Marketing and http://www.domain.com/owner/getinfo
UPDATE Usage is as below.
<rule name="Skip HTTPS" enabled="true" stopProcessing="true">
<match url="I need regex here" ignoreCase="true" />
<conditions>
<add input="{HTTP}" pattern="OFF" />
</conditions>
<action type="None" />
</rule>
UPDATE 2:
If I put it this way, will it work?
<match url="(Marketing|getinfo)" ignoreCase="true" />
If I understand correctly what you want, then it is straightforward: (http:\/\/www\.domain\.com\/owner\/Marketing|http:\/\/www\.domain\.com\/owner\/getinfo). See http://regex101.com/r/yW3aP9
You can use the pipe character | for alternation, together with parentheses. Adding a leading ^ and a trailing $ anchors the expression so that it matches only the exact URLs, with no leading or trailing garbage. Escape the slashes (which act as regex delimiters) and the dots (an unescaped . matches any character). So:
/^http:\/\/www\.domain\.com\/owner\/(Marketing|getinfo)$/
ok this one seems to work,
/http:\/\/www\.domain\.com\/owner\/(Marketing|getinfo)/
but regexpal does not match the strings when the entire expression is wrapped in a caret and dollar sign (^ and $); maybe a more experienced person will know why.
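
The effect of the anchors can be checked with a short sketch. Python's re module is used as a stand-in, and the surrounding / delimiters from the answers are dropped because they are not part of the pattern itself. PATH_ONLY is an assumption about what the rule would need in IIS, where <match url> is applied to the path without scheme and host.

```python
import re

# Anchored pattern for the full URL, as suggested in the answers.
FULL_URL = re.compile(
    r"^http://www\.domain\.com/owner/(Marketing|getinfo)$", re.IGNORECASE
)

# Assumed path-only variant for use in <match url>.
PATH_ONLY = re.compile(r"^owner/(Marketing|getinfo)$", re.IGNORECASE)

# Pattern from UPDATE 2: matches the words anywhere in the input.
UNANCHORED = re.compile(r"(Marketing|getinfo)", re.IGNORECASE)

print(FULL_URL.search("http://www.domain.com/owner/Marketing") is not None)
print(FULL_URL.search("http://www.domain.com/owner/getinfo/extra") is not None)
print(PATH_ONLY.search("owner/getinfo") is not None)
print(UNANCHORED.search("blog/marketing-tips") is not None)
```

The unanchored pattern from UPDATE 2 does match the two target URLs, but it also matches any other path that contains either word.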

How to redirect subfolder to query in Mod Rewrite for IIS 7.0?

I'm using Mod Rewrite for IIS 7.0 from iis.net and want to redirect requests:
http://example.com/users/foo to http://example.com/User.aspx?name=foo
http://example.com/users/1 to http://example.com/User.aspx?id=1
I have created 2 rules:
<rule name="ID">
<match url="/users/([0-9])" />
<action type="Rewrite" url="/User.aspx?id={R:1}" />
</rule>
<rule name="Name">
<match url="/users/([a-z])" ignoreCase="true" />
<action type="Rewrite" url="/User.aspx?name={R:1}" />
</rule>
It passes the test in the IIS MMC test dialog, but it doesn't work in debug (with a URL like http://localhost:9080/example.com/users/1 or …/users/foo), and it doesn't work on real IIS either!
What have I done wrong?
The obvious problem is that your current regexes only match one character in the user name or one number. You'll need to add a plus quantifier inside the parentheses in order to match multiple letters or numbers. See this page for more info about regex quantifiers. Note that you won't be matching plain URLs like "/users/" (no ID or name). Make sure this is what you intended.
The other problem you're running into is that IIS evaluates rewrite rules starting from the first character after the initial slash. So your rule to match /users/([0-9]) won't match anything because when the regex evaluation happens, the URL looks like users/foo not /users/foo. The solution is to use ^ (which is the regex character that means "start of string") at the start of the pattern instead of a slash. Like this:
<rule name="ID">
<match url="^users/([0-9]+)" />
<action type="Rewrite" url="/User.aspx?id={R:1}" />
</rule>
<rule name="Name">
<match url="^users/([a-z]+)" ignoreCase="true" />
<action type="Rewrite" url="/Users.aspx?name={R:1}" />
</rule>
Note that you're choosing Users.aspx for one of these URLs and User.aspx (no plural) for the other. Make sure this is what you intended.
BTW, the way I figured these things out was by using IIS Failed Request Tracing to troubleshoot the rewrite rules, which made diagnosing this really easy. I was able to make a test request and look through the trace to find where each rewrite rule is evaluated (in a section of the trace called "PATTERN_MATCH"). For the PATTERN_MATCH entry for one of your rules, I saw this:
-PATTERN_MATCH
Pattern /users/([0-9]+?)
InputURL users/1
Negate false
Matched false
Note the lack of the beginning slash.
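
Both problems described in this answer, the missing leading slash and the missing quantifier, can be reproduced with a small sketch. Python's re module stands in for the IIS engine, the inputs are the path-only strings shown in the trace, and the helper name first_group is hypothetical.

```python
import re


def first_group(pattern, path):
    """Return capture group 1 of an unanchored, case-insensitive search."""
    match = re.search(pattern, path, re.IGNORECASE)
    return match.group(1) if match else None


# Original pattern: requires a leading slash that the input URL lacks.
print(first_group(r"/users/([0-9])", "users/1"))

# Anchored with ^ but without a quantifier: only the first digit is captured.
print(first_group(r"^users/([0-9])", "users/123"))

# Corrected patterns from the answer.
print(first_group(r"^users/([0-9]+)", "users/123"))
print(first_group(r"^users/([a-z]+)", "users/foo"))
```

The first call returns None, which corresponds to the "Matched false" line in the trace.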
You should use <match url="/users/([0-9]+)" /> and <match url="/users/([a-z]+)" ignoreCase="true" />, respectively, to match the complete id/user and not just their first letter/digit. But I don't know why your regex would have failed on a single digit, so there must be another issue, too.
As for your second question, I'm not sure I understand completely. How can you tell the difference between a folder name and a user name? Will a folder always have a trailing slash?