Regex to match file extension while ignoring query string - regex

I'd like to build an IIS URL rewrite rule that matches several file extensions but ignores the query string.
Samples:
/hello.html // Match
/test?qs=world.html // Should not match
/test?qs=world.html&qs2=x // Should not match
Here is what I was using that does not work correctly:
<add matchType="Pattern" input="{HTTP_URL}" pattern=".+\.(js|css|less|html|eot|svg|ttf|woff|json|xml)$" negate="true" /> <!--Any url with a dot for file extension-->

Use \w+ instead of .+ (\w is equivalent to [a-zA-Z0-9_]):
<add matchType="Pattern" input="{HTTP_URL}" pattern="\w+\.(js|css|less|html|eot|svg|ttf|woff|json|xml)$" negate="true" /> <!--Any url with a dot for file extension-->
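If you want to allow other characters (not included in \w) and still ignore the query string, you can use [^?]+ instead of .+ and anchor it with ^, that is ^[^?]+\.(js|css|less|html|eot|svg|ttf|woff|json|xml)$. Without the leading ^ the match can still begin after the ?, so a URL such as /test?qs=world.html would be matched on its world.html tail.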
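A quick way to check the [^?]+ variant against the samples from the question is a small JavaScript test. IIS URL Rewrite uses ECMAScript-style pattern syntax by default, so the behaviour should be comparable, but it is worth confirming on the server. The helper name hasStaticExtension is made up for this illustration.

```javascript
// Anchored variant: the part before the extension may contain anything except "?",
// so a match can never start inside the query string.
const staticFile = /^[^?]+\.(js|css|less|html|eot|svg|ttf|woff|json|xml)$/;

// Hypothetical helper, only for this illustration.
function hasStaticExtension(url) {
  return staticFile.test(url);
}

console.log(hasStaticExtension('/hello.html'));               // true
console.log(hasStaticExtension('/test?qs=world.html'));       // false
console.log(hasStaticExtension('/test?qs=world.html&qs2=x')); // false

// For comparison: without a leading ^, the \w+ form still finds "world.html"
// at the end of the query string.
const unanchored = /\w+\.(js|css|less|html|eot|svg|ttf|woff|json|xml)$/;
console.log(unanchored.test('/test?qs=world.html'));          // true
```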

See if this works for you.
I wrote it in JavaScript (I don't know IIS), but the regex syntax should be similar:
/^.*\/[\w]+\.[\w]{2,4}$/.test('/test?qs=world.html') // returns false
/^.*\/[\w]+\.[\w]{2,4}$/.test('/world.html') // returns true
In IIS it might look like this:
<add matchType="Pattern" input="{HTTP_URL}" pattern="^.*\/[\w]+\.[\w]{2,4}$" negate="true" />
Here the extension is matched generically. If you want to whitelist specific extensions, replace the pattern with:
^.*\/[\w]+\.(js|css|less|html|eot|svg|ttf|woff|json|xml)$

I assume this will work in IIS: ^[^?]*$ will match any string not containing a ?.

Related

I need help making this regular expression to match part of url

I am writing a rewrite rule in a web.config file and want to match a URL (using a regular expression) if it contains:
*/admin*
So as long as the url contains above it should match. Example of legal matches:
http://test.com/admin
https://test.com/admin
http://test.com/admin/
http://test.com/admin/test
http://test.com/admin/grgr/hht/
Example of illegal matches:
http://test.com
https://test.com/adminpage
https://test.com/adminpage/
I have tried the following without success:
<match url="(.*)/admin$" ignoreCase="false" />
<match url="/admin?" ignoreCase="false" />
<match url=".*/admin?" ignoreCase="false" />
Try this regex
.*admin(\/.*|$)
Try something like this
.*\b\/admin\b.*
https://regex101.com/r/zq0SAy/1
Details:
\b asserts position at a word boundary
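Both suggested patterns can be checked against the URLs listed in the question with a short JavaScript sketch. The variable and function names are made up for illustration. The full URLs are used here as test input; inside a <match url="..."> element IIS normally passes only the path without the host or leading slash, so the second pattern (which requires a literal /admin) may need adjusting there.

```javascript
// The two patterns proposed in the answers above.
const candidates = [
  /.*admin(\/.*|$)/,
  /.*\b\/admin\b.*/,
];

// Sample URLs copied from the question.
const shouldMatch = [
  'http://test.com/admin',
  'https://test.com/admin',
  'http://test.com/admin/',
  'http://test.com/admin/test',
  'http://test.com/admin/grgr/hht/',
];
const shouldNotMatch = [
  'http://test.com',
  'https://test.com/adminpage',
  'https://test.com/adminpage/',
];

// Hypothetical helper: true when a pattern accepts every legal URL
// and rejects every illegal one.
function classifiesSamplesCorrectly(re) {
  return shouldMatch.every((u) => re.test(u)) &&
         shouldNotMatch.every((u) => !re.test(u));
}

for (const re of candidates) {
  console.log(re.source, classifiesSamplesCorrectly(re)); // true for both patterns
}
```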

301 Redirect Regex Pattern

I'm trying to make an IIS redirect rule that redirects from this URL pattern, but I can't get it right:
https://www.mycompanyPLC.com/en/lorem/ipsum/whatever
to
https://www.mycompanyLTD.com/lorem/ipsum/whatever
Basically I need to replace PLC with LTD and if there is the "/en/" group in url, this has to be removed.
You can meet both requirements with a single regex, provided /en/ comes directly after .com. Something like:
(.*?)PLC\.com(?:\/\ben\b)?(.*)
Explanation of the above regex:
(.*?) - Represents 1st capturing group capturing everything before PLC lazily.
PLC\.com - Matches PLC.com literally.
(?:\/\ben\b)? - Represents a non-capturing group matching /en literally zero or one time. \b represents a word boundary.
(.*) - Represents the second capturing group matching everything after /en greedily.
$1LTD.com$2 - For the replacement (or redirection, in this case) you can use this string, where $1 represents the first captured group and $2 represents the second captured group. In IIS, you can use {R:1}LTD.com{R:2}.
You can find the demo of the above regex in here.
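The pattern and replacement described above can be tried out in JavaScript, where $1 and $2 play the role of {R:1} and {R:2}. This is only a sketch: the function name redirectTarget is made up, and it operates on the full URL string, which may differ from the input IIS actually hands to the rule.

```javascript
// Pattern from the answer above: lazy prefix, literal "PLC.com",
// optional "/en" segment, then the rest of the URL.
const plcPattern = /(.*?)PLC\.com(?:\/\ben\b)?(.*)/;

// Hypothetical helper that applies the replacement string $1LTD.com$2.
function redirectTarget(url) {
  return url.replace(plcPattern, '$1LTD.com$2');
}

console.log(redirectTarget('https://www.mycompanyPLC.com/en/lorem/ipsum/whatever'));
// https://www.mycompanyLTD.com/lorem/ipsum/whatever

console.log(redirectTarget('https://www.mycompanyPLC.com/lorem/ipsum/whatever'));
// https://www.mycompanyLTD.com/lorem/ipsum/whatever
```

The word boundary after en is what keeps a path such as /english/... from losing its first two letters.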
Please refer to below URL rule.
<system.webServer>
<rewrite>
<rules>
<rule name="ReverseProxyInboundRule1" stopProcessing="true">
<match url="en(.*)" />
<action type="Redirect" url="https://www.mycompanyLTD.com{R:1}" />
</rule>
</rules>
</rewrite>
</system.webServer>
There is no need to anchor the match on the /en URL fragment. The request is redirected whenever the URL contains an en segment, and the same applies to the http/https part of the URL, which does not have to be matched either.
Feel free to let me know if there is anything I can help with.
After several hours of reading up on regex I've created this rule, and it seems to be working (I've tested several scenarios):
^(http|https)://?(www.)mycompanyPLC.com/en?(.*)
and the Redirect URL from IIS is:
https://www.mycompanyLTD.com/{R:3}
Later edit:
The rule in IIS is like this:
<rule name="Replace PLC with LTD and remove /en/" enabled="true" stopProcessing="true">
<match url="(.*?)PLC\.com(?:\/\ben\b)?(.*)" />
<conditions logicalGrouping="MatchAny" trackAllCaptures="false" />
<action type="Redirect" url="{R:1}ltd.com{R:2}" />
</rule>
Test urls were this format:
http://webdev.myCompanyplc.com/en/our-experience/retail
{R:1} = http://webdev.myCompany
{R:2} = /our-experience/retail
The regular expression was OK, but the redirect still didn't work.

split URL using Regular Expression in IIS

I am facing challenges in splitting the URL using regular expression.
I want to change the middle part of the URL, since we changed the URLs of the site's pages.
https://test.company.com/about/news/2015/test/award.aspx
The URL above needs to be replaced with the one below:
https://test.company.com/en/about/media/news/2015/test/award.aspx
I want to achieve this functionality using Regular Expression in IIS.
I tried the code as below in URL Rewrite in IIS,
about/news/2015(.*.+?)
Help to resolve this as required, thanks in advance.
The regex engine gets the string without the host and protocol, starting with about. Thus, you need to match starting with this fixed string, capture the parts between which you need to insert the required value and use
^(about)(/news/2015/)
Replace with
en/{R:1}/media{R:2}
Where {R:1} refers to about and the {R:2} refers to /news/2015/.
Here is a demo of how this regex works.
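The capture-and-insert idea can be sketched in JavaScript, with $1 and $2 standing in for {R:1} and {R:2}. The function name rewritePath is made up for illustration, and the input is assumed to be the path without host or leading slash, as described above.

```javascript
// Capture "about" and "/news/2015/" so that "en/" and "/media" can be
// inserted around them.
const newsPattern = /^(about)(\/news\/2015\/)/;

// Hypothetical helper mirroring the replacement en/{R:1}/media{R:2}.
function rewritePath(path) {
  return path.replace(newsPattern, 'en/$1/media$2');
}

console.log(rewritePath('about/news/2015/test/award.aspx'));
// en/about/media/news/2015/test/award.aspx

// Paths that do not start with about/news/2015/ are left untouched.
console.log(rewritePath('about/team.aspx'));
// about/team.aspx
```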
You could try below rule:
<rule name="rule11-1" stopProcessing="true">
<match url="^about/news/2015/(.*)" />
<conditions>
<add input="{REQUEST_URI}" pattern="^/about/(.*)" />
</conditions>
<action type="Redirect" url="https://test.company.com/en/about/media/{C:1}" />
</rule>

Use Regex to lookahead and redirect in case of match IIS

I have the following IIS rule which is supposed to redirect if the URI does not contain the word Api:
<rule name="React Routes" stopProcessing="true">
<match url=".*" />
<conditions logicalGrouping="MatchAll" trackAllCaptures="false">
<add input="{REQUEST_URI}" pattern="^((?!Api).)*$" negate="false" />
</conditions>
<action type="Rewrite" url="/" />
</rule>
This was working fine until I added a token as a query parameter for a route. Now when it tries to match that URI it will go out of memory.
How would I have to write the pattern so it looks only in the first 30 characters? The /Api/ route will never appear later. This way I will make sure that the regular expression matching does not run out of memory when a token is present.
To make sure Api does not occur within the first 30 chars you may use
pattern="^(?!.{0,27}Api).*"
Details
^ - start of string
(?!.{0,27}Api) - a negative lookahead that matches a location that is not immediately followed by any 0 to 27 chars (other than linebreak chars) and then Api
.* - any 0+ chars (other than linebreak chars).
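The bounded lookahead can be exercised in JavaScript to see where the 30-character window ends. This is a sketch with made-up sample paths and a placeholder token; it says nothing about memory use inside IIS itself.

```javascript
// Pattern from the answer above: reject only when "Api" starts within
// the first 28 positions (offsets 0..27), i.e. ends within the first 30 chars.
const noApiNearStart = /^(?!.{0,27}Api).*/;

// Placeholder standing in for a long query-string token.
const longToken = 'x'.repeat(2000);

console.log(noApiNearStart.test('/Api/users'));                             // false
console.log(noApiNearStart.test('/dashboard'));                             // true
console.log(noApiNearStart.test('/dashboard?token=' + longToken));          // true
console.log(noApiNearStart.test('/dashboard?token=' + longToken + 'Api'));  // true, "Api" is past the window

// Boundary check: "Api" starting at offset 27 is still seen, at offset 28 it is not.
console.log(noApiNearStart.test('a'.repeat(27) + 'Api')); // false
console.log(noApiNearStart.test('a'.repeat(28) + 'Api')); // true
```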

Regular expression to filter 2 urls

Can I filter (or match) only the two URLs below using a regular expression?
http://www.domain.com/owner/Marketing and http://www.domain.com/owner/getinfo
UPDATE Usage is as below.
<rule name="Skip HTTPS" enabled="true" stopProcessing="true">
<match url="I need regex here" ignoreCase="true" />
<conditions>
<add input="{HTTP}" pattern="OFF" />
</conditions>
<action type="None" />
</rule>
UPDATE 2:
If I put it this way, will it work?
<match url="(Marketing|getinfo)" ignoreCase="true" />
If I understand correctly what you want, then it is straightforward: (http:\/\/www\.domain\.com\/owner\/Marketing|http:\/\/www\.domain\.com\/owner\/getinfo). See http://regex101.com/r/yW3aP9
You can use parentheses with the pipe character | for alternation. Adding a leading ^ and a trailing $ anchors the expression so that it matches only the exact URLs, with no leading or trailing garbage. Escape the slashes (which act as regex delimiters) and the dots (an unescaped . matches any character). So:
/^http:\/\/www\.domain\.com\/owner\/(Marketing|getinfo)$/
OK, this one seems to work:
/http:\/\/www\.domain\.com\/owner\/(Marketing|getinfo)/
but regexpal does not match the strings when the whole expression is wrapped in a caret and a dollar sign (^ and $); maybe a more experienced person will know why.
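The anchored form can be checked in JavaScript against one URL at a time. The i flag mirrors ignoreCase="true" from the rule; the variable names and the extra sample URLs are made up for illustration. The second pattern is an assumption about how the same idea would look if the input is only the path without host or leading slash, which is what a <match url="..."> element normally receives.

```javascript
// Anchored pattern for the two full URLs, case-insensitive.
const onlyTwoUrls = /^http:\/\/www\.domain\.com\/owner\/(Marketing|getinfo)$/i;

console.log(onlyTwoUrls.test('http://www.domain.com/owner/Marketing'));     // true
console.log(onlyTwoUrls.test('http://www.domain.com/owner/getinfo'));       // true
console.log(onlyTwoUrls.test('http://www.domain.com/owner/getinfo/extra')); // false
console.log(onlyTwoUrls.test('http://www.domain.com/owner/other'));         // false

// Assumed path-only equivalent (no scheme, host, or leading slash).
const pathOnly = /^owner\/(Marketing|getinfo)$/i;
console.log(pathOnly.test('owner/marketing'));  // true
console.log(pathOnly.test('owner/getinfo/x'));  // false
```

When several lines are tested at once in an online tool, ^ and $ only match at each line's start and end if the multiline option is enabled, which is one possible reason an anchored pattern appears not to match there.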