IIS URL rewrite not working properly 404 errors - regex

I am upgrading a Joomla website hosted on IIS 10. I have oldsite.com and newsite.com. The new site has a slightly different folder structure, but the page names and content are the same. The client, rightly, doesn't want to lose SEO ranking on the old pages and wants them redirected to the correct pages on the upgraded site.
I need to do the following, where * is a wildcard that stands for whatever is typed in its place in the URL:
/div-services/* will redirect to /div/*
/div-questions/* will redirect to /div/questions/*
/fm-lw-services/* will redirect to /fm-lw/*
/locations/* will redirect to /contact/*
/resources/blog/* will redirect to /blog/*
/contact-us/* will redirect to /contact/*
I initially set up my pattern as
(.*)(div-services)(.*) becomes {R:1}( div){R:1}
It worked well until the matching phrase was repeated in some form in the URL. In this case, when "div-services" appears again in the URL, it gets replaced as well.
For example, if the URL is newsite.com/div-services/xyz/abc-div-services, the rule replaces both occurrences of "div-services", which is not desired; I only need to replace the first occurrence. I thought it was an easy fix and changed my pattern to the following
(.*)(/div-services/)(.*) replace to {R:1}(/div/){R:1}
Even though it validates successfully in the pattern test, it just doesn't work and does not rewrite the URL. I even tried with the escape character
(.*)(\/div-services\/)(.*) becomes {R:1}(/div/){R:1}
Still no luck. After digging and digging I found following example
div-services/(.*)$ becomes div/{R:1}
This generally worked well, but it fails when the URL has no trailing forward slash.
For example, newsite.com/div-services won't work, but newsite.com/div-services/ and newsite.com/div-services/xyx work fine.
I am at a loss, and any help will be much appreciated. I just don't understand why I can't detect the forward slash /.

FYI, I figured out why (.*)(/div-services/)(.*) becomes {R:1}(/div/){R:1} was not working.
The input to the pattern starts after the first forward slash; I was assuming it would be the whole URL. That is why my regular expression validated in the pattern test but didn't actually work: at run time the rule only sees the part of the URL after the first slash. This explains why many of my patterns failed even though they passed the pattern test. Hopefully this saves others the hours I wasted because I didn't clearly understand how it worked.
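For anyone who wants to experiment outside IIS, here is a rough Python sketch of that behaviour. It is only a simulation: the `RULES` table and the `rewrite` helper are made-up names, and the patterns are my own guess at rules that would cover the list above, including the case without a trailing slash. The key assumption, taken from the explanation above, is that the input has no leading slash, so each pattern is anchored with `^` directly on the folder name.

```python
import re

# Hypothetical rule table: (pattern, new prefix). Patterns are matched against
# the path WITHOUT its leading slash, e.g. "div-services/xyz".
# "(/.*)?" makes the trailing slash and anything after it optional.
RULES = [
    (re.compile(r"^div-services(/.*)?$"), "div"),
    (re.compile(r"^div-questions(/.*)?$"), "div/questions"),
    (re.compile(r"^fm-lw-services(/.*)?$"), "fm-lw"),
    (re.compile(r"^locations(/.*)?$"), "contact"),
    (re.compile(r"^resources/blog(/.*)?$"), "blog"),
    (re.compile(r"^contact-us(/.*)?$"), "contact"),
]


def rewrite(path):
    """Return the rewritten path, or None when no rule matches."""
    for pattern, prefix in RULES:
        match = pattern.match(path)
        if match:
            # group(1) is None when the URL ends right after the folder name.
            return prefix + (match.group(1) or "")
    return None
```

Because each pattern is anchored at the start, a later "div-services" inside the URL is left alone, and `rewrite("/div-services/xyz")` returns None, which mirrors why patterns containing the leading slash never fired. In IIS the matching rule would presumably be the pattern `^div-services(/.*)?$` with the action `div{R:1}`, but I have not tested that form.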

Related

Regex to match a URL with parameters but reject URL with subfolder

Short Question: What regex statement will match the parameters on a URL but not match a subfolder? For example match google.com?parameter but not match google.com/subdomain
Long Question: I am re-directing a few URLs on a site.
I want a request to ilovestarwars.com/page2 to re-direct to ilovestarwars.com/forceawakens
I setup this re-direct and it works great most of the time. The problem is when there are URL parameters. For example if someone sends the URL using an email program that tracks links. Then ilovestarwars.com/page2 becomes ilovestarwars.com/page2?parameter=trackingcode123 after they send it which results in a 404 on my site because it is looking for the exact URL.
No problem, I will just use Regex. So I now re-direct using ilovestarwars.com/page2(.*) and it works great accepts all the parameters, no more 404s.
However, trying to future proof my work, I am worried, what happens if someone adds content inside the page2 folder? For example ilovestarwars.com/page2/mistake
They shouldn't, but if they do, it will take them forever to figure out why it is redirecting.
So my question is, how can I create a regex statement that will match the parameters but reject a subfolder?
I tried page2(.*?)/ as is suggested in this answer, but https://www.regex101.com/ says the slash is an unescaped delimiter.
Background info as suggested here, I am using Wordpress and the Redirection plugin. This is the article that goes over the initial redirect I setup.
A direct answer to your question would be something like this: ^/([^?&/\\]*)(.*)$
This assumes the string starts at the first / (if it doesn't, remove the / that follows the ^). In the first capture group you will get the page name (page2, in the case of your example URL) and in the second capture group you will get the remaining part of the URL (everything from the first of these characters onward: ?, &, /, \). Note that the backslash inside the character class has to be escaped as \\, otherwise it escapes the closing bracket. If you don't care about the second capture group, use ^/([^?&/\\]*).*$
An indirect answer would be that you don't do it this way. Instead, there should be an index page in folder page2 that uses a 301 redirect to redirect to the proper page. It would make much more sense to do it statically. I understand that you may not have that much control over your webpage, though, since it is Wordpress, in which case the former answer should work with the given plugin.
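If the goal is simply "page2 with optional parameters, but never a subfolder", a narrower pattern may be easier to reason about. The Python sketch below is only an illustration: `PAGE2` and `should_redirect` are made-up names, and it assumes the plugin matches against the path plus query string, starting at the first slash.

```python
import re

# "/page2", optionally followed by "?" and a query string, and nothing else.
PAGE2 = re.compile(r"^/page2(\?.*)?$")


def should_redirect(url_path):
    """True when the request is for page2 itself, with or without parameters."""
    return PAGE2.match(url_path) is not None
```

With this pattern `/page2` and `/page2?parameter=trackingcode123` match, while `/page2/mistake` does not, because the only thing allowed after `page2` is a `?`.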

matching only numbers in a regex string for redirect

I am using a redirection plugin for WordPress and have no experience with regex.
I have a url that can have anything after the url, but I only want to redirect if only numbers appear and nothing else, such that of the following urls only the last one would get a match:
http://j.net/contact
http://j.net/c4t
http://j.net/4con
http://j.net/4co5
http://j.net/anything/123 * this should fail
http://j.net/456 * this should pass
I came up with this:
(\d+)$
to:
article/$1
But I ended up in an infinite loop.
Edit: the loop seems to come into play when navigating to:
http://j.net/1289
Or:
http://j.net/dribble/1289
Your solution seems to work fine, after escaping the / character
See the example: http://regex101.com/r/cX4bV6/2
PS. I'm not sure what language you are using and whether WordPress would support it.
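One possible cause of the loop, sketched in Python: the unanchored pattern (\d+)$ also matches the redirect target itself, since article/1289 ends in digits too. Anchoring the pattern to the whole path avoids that. The names below (`UNANCHORED`, `ANCHORED`, `target`) are made up for the illustration, and it assumes the plugin sees the path starting with a slash.

```python
import re

UNANCHORED = re.compile(r"(\d+)$")   # the pattern from the question
ANCHORED = re.compile(r"^/(\d+)$")   # digits only, directly after the first slash


def target(pattern, path):
    """Return the redirect target for path, or None when the pattern does not match."""
    match = pattern.search(path)
    return "/article/" + match.group(1) if match else None
```

Here `target(UNANCHORED, "/article/456")` gives `/article/456` again, so the redirect would keep firing on its own output, while `target(ANCHORED, "/article/456")` is None. The anchored version also rejects `/anything/123` and `/4co5`, as the question asks.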

Regex to change old url to new with wordpress redirection

I want to redirect for example
www.mydomain.com/my-profile.html?userId=18681
to
www.mydomain.com/members
what shall i put in my Source URL?
I have more than 2000 404 errors in Webmaster Tools because I changed from one CMS to another, so I want to fix my redirection regex rather than enter the errors one by one, because I have
/my-profile.html?userId=18681
/my-profile.html?userId=12451
/my-profile.html?userId=9251
How can i make it general so it automatic redirects all to www.mydomain.com/members
I use this plugin http://wordpress.org/plugins/redirection/
I'm not sure how you're going about implementing the redirect. But from a purely regex standpoint, If I wanted to convert the top url format to the one you put below it, here is the find-and-replace format I would use:
s/(my-.+\d+)$/members/
So find 'my-', then one or more of any character, then ENDING with one or more digits. Replace that (starting with my- and ending with the digits) with 'members'.
Sorry if this does not solve your issue. Keep in mind this is "Perl compatible" regex format; find-and-replace may (and likely will) be formatted differently in whatever you are implementing this with.
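As a rough check of the same idea with a stricter pattern, here is a Python sketch. `PROFILE` and `redirect` are made-up names, and it assumes the source URL is matched as path plus query string; whether the plugin includes the query string in what it matches is something to verify in its settings.

```python
import re

# Literal "/my-profile.html?userId=" followed by digits only.
# The "." and "?" are escaped because they are regex metacharacters.
PROFILE = re.compile(r"^/my-profile\.html\?userId=\d+$")


def redirect(path_and_query):
    """Return "/members" for any old profile URL, otherwise None."""
    return "/members" if PROFILE.match(path_and_query) else None
```

All three example URLs from the question map to `/members`, while something like `/my-profile.html?userId=abc` is left alone.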

regex rewrite url cluster

I've been trying to learn regex and it's terribly complicated. I'm not even positive that it's possible to rewrite these URLs without doing them individually. I can do them individually (search & replace), but there are a few different clusters and there are thousands of URLs (migration).
This is a Joomla site running acesef software. Here is an example URL from 1 particular cluster. The end of the URL is identical for old and new URL. Only the beginning directories have changed. So is there a way to match the end of the URL for all URLs in those particular directories from old to new and rewrite it with a single expression?
Old URL = www.domain.com/property-details/condominiums/3448-page-title
New URL = www.domain.com/bangkok/condos/rent/3448-page-title
I won't even bother posting what I've tried to write so far, because it's so far off. I'm trying to get my feet wet with regex, but this is a pretty complicated rewrite for a beginner.
Well uh, at face value you could just use this:
[^/]+$
This will give you anything after the last / so in your example, you'd get 3448-page-title
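To turn that into a full rewrite for one cluster, the last segment can be captured and appended to the new directory. A minimal Python sketch, where `OLD` and `new_url` are made-up names and the directories are the ones from the example in the question:

```python
import re

# Capture the last path segment of URLs in the old directory.
OLD = re.compile(r"^/property-details/condominiums/([^/]+)$")


def new_url(old_path):
    """Move the captured segment under the new directory; other paths are returned unchanged."""
    return OLD.sub(r"/bangkok/condos/rent/\1", old_path)
```

Each cluster would need its own pattern and replacement pair, but one expression then covers every URL inside that cluster.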

RegEx to Strip text from middle of URL

I have a client that's moving her site from Blogger to a Wordpress blog. I put some code in Blogger to redirect visitors to the right Wordpress post/page. So when Blogger redirects a post, it comes through as www.domain.com/?bloggerURL=/yyyy/mm/the-post-slug.html
With this Regex, I'm successfully returning the proper Wordpress URL: www.domain.com/yyyy/mm/the-post-slug (with "?bloggerURL=/" and ".html" removed)
^\?bloggerURL=/(.*)\.html$
Blogger Pages get redirected like www.domain.com/?bloggerURL=/p/the-page-slug.html I tried just adding the /p to the Regex to strip out this case also, but it's not working.
^\?bloggerURL=/p/(.*)\.html$
For instance, www.domain.com/?bloggerURL=/p/about.html should be redirected to www.domain.com/about, but the URL is remaining as www.domain.com/?bloggerURL=/p/about.html
I'm probably missing something simple to get it to pick up the first part of the string and remove it. Is there something I need to add/remove to get that case working?
Just going through old, unanswered questions... hopefully you've solved this one, but in case not, the only problem I see with the regex is the non-escaped slashes. I think you would need to use:
^\?bloggerURL=\/p\/(.*)\.html$
Hope this might help someone else.
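A quick Python experiment, in case it helps narrow things down. In Python's `re` the escaped and unescaped slashes behave the same, so whether escaping matters depends on the engine behind the redirect tool. Another thing that may be worth checking is rule order: the general post pattern also matches page URLs, so if it runs first the `/p/` rule never gets a chance. The names `PAGE`, `POST`, `PAGE_ESCAPED` and `wordpress_path` are made up for this sketch.

```python
import re

PAGE = re.compile(r"^\?bloggerURL=/p/(.*)\.html$")
POST = re.compile(r"^\?bloggerURL=/(.*)\.html$")
PAGE_ESCAPED = re.compile(r"^\?bloggerURL=\/p\/(.*)\.html$")  # same meaning in Python


def wordpress_path(query):
    """Try the more specific page rule first, then the general post rule."""
    for pattern in (PAGE, POST):
        match = pattern.match(query)
        if match:
            return "/" + match.group(1)
    return None
```

With the page rule first, `?bloggerURL=/p/about.html` becomes `/about`. Applied on its own, the post rule captures `p/about` from the same input, which would send the visitor to `/p/about` instead.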