Regex to match a URL with parameters but reject URL with subfolder - regex

Short Question: What regex will match the parameters on a URL but not match a subfolder? For example, match google.com?parameter but not google.com/subfolder
Long Question: I am re-directing a few URLs on a site.
I want a request to ilovestarwars.com/page2 to re-direct to ilovestarwars.com/forceawakens
I set up this redirect and it works great most of the time. The problem is when there are URL parameters, for example when someone sends the URL using an email program that tracks links. Then ilovestarwars.com/page2 becomes ilovestarwars.com/page2?parameter=trackingcode123, which results in a 404 on my site because the redirect is looking for the exact URL.
No problem, I will just use regex. So I now redirect using ilovestarwars.com/page2(.*) and it works great: it accepts all the parameters, no more 404s.
However, trying to future proof my work, I am worried, what happens if someone adds content inside the page2 folder? For example ilovestarwars.com/page2/mistake
They shouldn't, but if they do, it will take them forever to figure out why it is redirecting.
So my question is, how can I create a regex statement that will match the parameters but reject a subfolder?
I tried page2(.*?)/ as is suggested in this answer, but https://www.regex101.com/ says the slash is an unescaped delimiter.
Background info as suggested here, I am using Wordpress and the Redirection plugin. This is the article that goes over the initial redirect I setup.

A direct answer to your question would be something like this: ^/([^?&/\\]*)(.*)$
This assumes the string starts at the first / (if it doesn't, remove the / that follows the ^). The first capture group holds the page name (page2, in the case of your example URL) and the second capture group holds the remaining part of the URL (everything from the first of these characters onward: ?, &, /, \). If you don't care about the second capture group, use ^/([^?&/\\]*).*$
If your tool uses / as the pattern delimiter, as regex101 does by default, escape each slash as \/.
An indirect answer would be that you don't do it this way. Instead, there should be an index page in folder page2 that uses a 301 redirect to redirect to the proper page. It would make much more sense to do it statically. I understand that you may not have that much control over your webpage, though, since it is Wordpress, in which case the former answer should work with the given plugin.
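To accept a query string on page2 while rejecting a subfolder, as the question asks, one option is to allow only an optional trailing slash and an optional query string after the page name. The sketch below is in Python purely for illustration. The pattern and the helper name should_redirect are assumptions, and the Redirection plugin may match against a slightly different string (for example with or without the leading slash).

```python
import re

# Hypothetical pattern: "/page2", an optional trailing slash, then either
# the end of the string or a query string. No further path segment is allowed.
PAGE2_ONLY = re.compile(r"^/page2/?(?:\?.*)?$")


def should_redirect(request_uri: str) -> bool:
    """Return True when the request targets page2 itself, with or without parameters."""
    return PAGE2_ONLY.match(request_uri) is not None


if __name__ == "__main__":
    for uri in ["/page2", "/page2?parameter=trackingcode123", "/page2/mistake"]:
        print(uri, should_redirect(uri))
```

With this shape, /page2/mistake and /page22 fall through to normal handling instead of being redirected.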

Related

In Chrome, eventually using redirect extension with regex, how to redirect all facebook.com URLs to mbasic.facebook.com URLS

I have been searching for days for this; it seems like no one has had the same idea as mine about redirecting Facebook URLs from the normal website to the mbasic version of the site.
My idea is to redirect all "normal" facebook.com URLs to mbasic.facebook.com ones at ALL times, both when I click links on the web and when I enter a URL in the address bar. Preferably matching facebook.com, www.facebook.com, http(s)://facebook.com, http(s)://www.facebook.com
-- essentially, let them all have a "mbasic." before "facebook" so that I never see the ordinary Facebook site again and only use the basic version.
I've found some redirector extensions in Chrome that use regex (right now I am using Requestly), and I think I'm close, but this regex seems invalid:
^((http(s)?:\/\/)?)((www.)?)facebook.com(\/)?.*$
This is what it matches for me at RegExr: https://i.imgur.com/eljJ1vP.png .
Also tried this one, but is also invalid apparently:
^(http://)*(www.)*((?!mbasic).)*.$
Other times using either regex or wildcard matching, I could get a query for facebook.com to change to mbasic.facebook.com, but whenever I entered, say, an event page (facebook.com/events/ID), it would not redirect, or, the "mbasic" part would be repeated, resulting in a "mbasic.mbasic.facebook.com" redirect.
I also looked at userscripts and at changing the hosts file, but I can't seem to find a solution just yet. Hope you can help a bit. Please ask for more information if needed! Thanks in advance.
Since you have already mentioned you are using Requestly, let me try to put up a simple solution using Requestly Redirect Rule and Wildcard operator.
Try Redirect Rule:
Source - Url Matches (WildCard) http*://*facebook.com*
Destination - http$1://mbasic.facebook.com$3
The explanation is simple:
First Match ($1) - 's' for the https protocol or '' for the http protocol
Second Match ($2) - Subdomain of facebook, e.g. www. or any other
Third Match ($3) - Anything after facebook.com
We want to replace the $2 value with mbasic and keep $1 and $3 as they are, which is what the Destination field does.
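The three matches can be illustrated with an ordinary regex. The Python sketch below is only an approximation of the wildcard rule, assuming each * behaves like a capture group; it is not Requestly's implementation, and the name to_mbasic is made up for this example.

```python
import re

# Assumed regex equivalent of the wildcard http*://*facebook.com*:
# each * becomes one capture group, numbered like $1..$3 in the rule.
WILDCARD_AS_REGEX = re.compile(r"^http(.*?)://(.*?)facebook\.com(.*)$")


def to_mbasic(url: str) -> str:
    """Rewrite a facebook.com URL to mbasic.facebook.com, keeping scheme and path."""
    m = WILDCARD_AS_REGEX.match(url)
    if m is None:
        return url  # not a facebook.com URL, leave it untouched
    scheme_suffix, _subdomain, rest = m.groups()
    # The captured subdomain ($2) is discarded and replaced by "mbasic."
    return f"http{scheme_suffix}://mbasic.facebook.com{rest}"


if __name__ == "__main__":
    print(to_mbasic("https://www.facebook.com/events/123"))
```

Because the existing subdomain is replaced rather than prefixed, a URL that already points at mbasic.facebook.com comes out unchanged instead of turning into mbasic.mbasic.facebook.com.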
I tried it and it seemed to work fine.
Disclaimer - I work at Requestly. For further questions feel free to reach out to us at contact@requestly.io or tweet at @requestlyIO

regex - Match # part of the url on the server

I'm trying to write a regex to match parts of urls and use a SEO redirection wordpress plugin to create a 301 redirect on the matching results.
if, for example, I write these URLs:
https://www.test.com/my-site
https://www.test.com/my-site/
I want to be redirected to:
https://www.test.com/your-site/
But if the URLs are followed by a hash (#) like the one below:
https://www.test.com/my-site/#/..
do not redirect.
I have played around for a bit with regExr and this is as far as I could get:
regexr.com/3scpb
But when I try to implement it inside the plugin, the redirect doesn't work.
What am I doing wrong here?
Is it better to do it straight inside the .htaccess file?
Would it be better and more robust/reliable that way?
Thanks
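The hash is never sent by the browser.
The browser uses the hash internally to decide which fragment of the document to focus on; this part of the URL is called the fragment identifier. It is stripped before the request is made, so your server will never see the # or anything after it, and no redirect rule (in the plugin or in .htaccess) can match on it. You cannot prevent this behavior.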
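A small Python sketch makes the point concrete. It uses the standard urllib.parse.urlsplit to separate the fragment from the rest of the URL; the helper name request_target is an assumption for illustration and only mimics what a client puts in the request line.

```python
from urllib.parse import urlsplit


def request_target(url: str) -> str:
    """Return the path (plus query) a client would send; the fragment is left out."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target  # parts.fragment stays on the client side


if __name__ == "__main__":
    print(request_target("https://www.test.com/my-site/"))
    print(request_target("https://www.test.com/my-site/#/.."))
```

Both URLs produce the same request target, so a server-side rule has nothing to tell them apart.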

Need help creating regex redirects for URLs with query strings attached

The issue is that when a URL is used in an email sent through Infusionsoft or another email platform, the platform adds a query string to the URL. If that URL is being redirected, it will not redirect properly with the query string attached, sending the user to a 404 page.
I am trying to figure out how to correctly create a regex expression in order to redirect the page and catch that query attached to it in order to redirect properly.
I think I've figured out how to do THAT, but then need to figure out how to exclude a URL that has the same beginning text...
For example:
If the original url is: /page-url/
And needs to redirect to /page-url-free/
So these versions need to redirect:
/page-url/page-url/
/page-url/?inf_contact_key=474a03f754bb3dadf5415b3b652fc7baa6979sf0112d3fe
But I need the regex expression to NOT catch /page-url-free/ since that would cause an infinite loop.
Any advice would be amazing. Thanks so much
If you just need to replace page-url with page-url-free while keeping the query parameters (if provided) exactly the same, the following regex will work:
\/(page-url)\/?(?:\?.+)?$
The above regex captures page-url, allowing an optional trailing slash and an optional query string after it, so it does not match /page-url-free/. The captured page-url can then be replaced with page-url-free, depending on what you are using.
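As a rough check of the loop concern, here is a Python sketch that redirects the old page, carries the query string over, and ignores /page-url-free/. The anchored pattern and the function name redirect_target are assumptions for this example, not something the plugin or the email platform provides.

```python
import re
from typing import Optional

# Assumed variant: "/page-url", an optional trailing slash, an optional
# query string, and nothing else. The query string is captured for reuse.
OLD_PAGE = re.compile(r"^/page-url/?(\?.*)?$")


def redirect_target(request_uri: str) -> Optional[str]:
    """Return the new location, or None when the request should not be redirected."""
    m = OLD_PAGE.match(request_uri)
    if m is None:
        return None
    return "/page-url-free/" + (m.group(1) or "")


if __name__ == "__main__":
    print(redirect_target("/page-url/?inf_contact_key=474a03f7"))
    print(redirect_target("/page-url-free/"))
```

Since /page-url-free/ never matches, the redirect cannot feed itself.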

IIS URL rewrite not working properly 404 errors

I am upgrading a Joomla website set up on IIS 10. I have oldsite.com and newsite.com. My new site has a slightly different folder structure, but page names and content are the same. Understandably, the client doesn't want to lose SEO ranking on the old pages and wants to redirect them to the correct ones on the new upgraded site.
I need to do the following:
* is a wildcard and will be replaced with whatever is typed in the URL in its place
/div-services/* will redirect to /div/*
/div-questions/* will redirect to /div/questions/*
/fm-lw-services/* will redirect to /fm-lw/*
/locations/* will redirect to /contact/*
/resources/blog/* will redirect to /blog/*
/contact-us/* will redirect to /contact/*
I initially setup my pattern to
(.*)(div-services)(.*) becomes {R:1}( div){R:1}
It worked well until the matching phrase was repeated elsewhere in the URL. In this case, when “div-services” appears again in the URL, it gets replaced as well.
For example, if the URL is newsite.com/div-services/xyz/abc-div-services, the rule replaces both occurrences of “div-services”, which is not desired; I only need to replace the first occurrence. I thought it was an easy fix and changed my pattern to the following:
(.*)(/div-services/)(.*) replace to {R:1}(/div/){R:1}
Even though it validates successfully in the pattern test, it just doesn’t work and does not rewrite the URL. I even tried with the escape character:
(.*)(\/div-services\/)(.*) becomes {R:1}(/div/){R:1}
Still no luck. After digging and digging I found the following example:
div-services/(.*)$ becomes div/{R:1}
This generally worked well, but now it doesn’t work if the URL has no trailing forward slash.
For example, newsite.com/div-services won’t work, but newsite.com/div-services/ and newsite.com/div-services/xyx work fine.
I am just at a loss; any help will be much appreciated. I just don’t understand why I can’t detect the forward slash /
FYI, I figured out why this was not working: (.*)(/div-services/)(.*) becomes {R:1}(/div/){R:1}
The rule input starts after the first forward slash. I was assuming the pattern would be matched against the whole URL, which is why my regular expression validated in the pattern test but did not work in practice: at run time it only sees the URL after the first slash. That explains why many of my patterns were not working even though they passed the pattern test. Hopefully this saves others the hours and hours I wasted because I didn't have a clear understanding of how it worked.
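The effect of matching against the path without its leading slash can be sketched in Python. This only mimics the behavior described above; it is not IIS configuration. The RULES table restates the redirects from the question, and the per-rule pattern (prefix followed by an optional "/rest") is an assumption chosen so that the missing trailing slash case also works.

```python
import re

# (old prefix, new prefix) pairs taken from the list in the question.
RULES = [
    ("div-services", "div"),
    ("div-questions", "div/questions"),
    ("fm-lw-services", "fm-lw"),
    ("locations", "contact"),
    ("resources/blog", "blog"),
    ("contact-us", "contact"),
]


def rewrite(url_path: str) -> str:
    """Apply the first matching rule to a path such as "/div-services/xyz"."""
    # Assumption: like the rule input described above, drop the leading slash.
    rule_input = url_path[1:] if url_path.startswith("/") else url_path
    for old, new in RULES:
        # Anchored at the start, so a later "div-services" in the URL is untouched.
        m = re.match(rf"^{re.escape(old)}(/.*)?$", rule_input)
        if m:
            return "/" + new + (m.group(1) or "")
    return url_path


if __name__ == "__main__":
    print(rewrite("/div-services/xyz/abc-div-services"))
```

Anchoring with ^ and making the "/rest" part optional covers both earlier problems: only the first occurrence is rewritten, and /div-services without a trailing slash still matches.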

Regex to look for url start value and end value

I'm using regex to look for URLs that start with http or https and contain a specific value.
^http|https\:\/\/www
This regex looks at the http/https in a URL and this works.
/[\/]\bvalue?\b[\/]/g
This regex looks for "value" in a url and this currently matches with
http://www.test.co.uk/value/
http://www.test.co.uk/folder/value/
Is it possible to put those two regexes together? Basically I need to display URLs that don't contain http/https or /value/ in the URL path.
You're looking to do this: /(?=^(https|http))|(\bvalue\b)/g
First half: (?=^(https|http)) is a lookahead that checks whether the string starts with https or http. In my opinion you can reduce this to just http, because any string that starts with https also starts with http. You may think this is not going to work, but logically it does. You can try it and see what happens.
Second half: (\bvalue\b). You can be more specific, for example by requiring it to sit between slashes, or not. I used the \b word boundary so that it is not matched as part of a longer word, and it worked quite well.
The important part is to combine the two halves with the | operator, which yields the regex above.
Test strings:
http://www.helloworldvalue/value/values/
https://www.helloworldvalue/values/svalue/value/value/vaaluevalue/
Try it and let me know if you have any questions in the comments below.
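For the filtering described in the question (show only URLs that neither start with http/https nor contain /value/), a Python sketch could look like the following. The combined pattern and the names FLAGGED and urls_to_display are assumptions for illustration. Note that in ^http|https\:\/\/www the ^ applies only to the first alternative, which is why the scheme is written here as ^https?://.

```python
import re

# Either the URL starts with http:// or https://, or it has a /value/ path segment.
FLAGGED = re.compile(r"^https?://|/value/")


def urls_to_display(urls):
    """Keep only the URLs that match neither part of the combined pattern."""
    return [u for u in urls if FLAGGED.search(u) is None]


if __name__ == "__main__":
    sample = [
        "http://www.test.co.uk/value/",
        "www.test.co.uk/folder/value/",
        "www.test.co.uk/folder/values/",
    ]
    print(urls_to_display(sample))
```

Using /value/ with explicit slashes, rather than \bvalue\b, keeps paths such as /values/ or /svalue/ from being treated as a match.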