301 Redirects using RegEx - regex

I'm not great with RegEx. I have an ecommerce site moving from PD Shop to WooCommerce. I need to write 301s so that each page on the old site redirects to its corresponding page on the new site. The problem is that the URL structure for site A is completely different than it is for site B. Rather than doing it manually for thousands of products, I wanted to use RegEx, but I'm not even sure it can be done.
If anyone has any insight on how to pull this off, I'd really appreciate the help. I'd prefer not to do it one link at a time, but I can't see how to avoid it.
Old links are structured like this:
www.domain.com/shop/item.aspx/item-name/id/
Examples:
www.domain.com/shop/item.aspx/sierra-saw/58/
www.domain.com/shop/item.aspx/duffle-bag-double-strap-olive/2206/
www.domain.com/shop/item.aspx/duffle-bag-side-zipper-black/2207/
New links are structured like this:
www.domain.com/product/item-name/
Examples:
www.domain.com/product/sierra-saw/
www.domain.com/product/double-strap-duffle-bag/
www.domain.com/product/double-strap-duffle-bag/

You should match www.domain.com/shop/item.aspx/([^/]+)/.* and replace it with www.domain.com/product/\1/.
The matching pattern matches URLs starting with the common root (www.domain.com/shop/item.aspx/), captures the next path fragment (everything up to the next slash) in a group, and matches the rest of the line.
The replacement just repeats the captured path fragment after the new common root.
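As a rough illustration, the substitution above can be tried out in Python with re.sub. This is only a sketch: the function name to_new_url is made up for the example, the dots in the pattern are escaped so they match literal dots, and it only helps where the item name is the same on both sites. A renamed product (such as duffle-bag-double-strap-olive becoming double-strap-duffle-bag) would still need a lookup table.

```python
import re

# Old root, then the item name captured up to the next slash, then the rest (the id).
OLD_URL = re.compile(r"^www\.domain\.com/shop/item\.aspx/([^/]+)/.*$")


def to_new_url(old_url):
    """Hypothetical helper: move the captured item name under /product/."""
    # \1 re-inserts the captured item name; non-matching URLs are returned unchanged.
    return OLD_URL.sub(r"www.domain.com/product/\1/", old_url)


if __name__ == "__main__":
    print(to_new_url("www.domain.com/shop/item.aspx/sierra-saw/58/"))
```

URLs that do not start with the old root come back unchanged, which makes it easy to spot the ones the pattern does not cover.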

Related

Regex for multiple URLs without clear pattern

I'm quite new to using regex so I hope there's someone who can help me out. I want to set up an event on Google Tag Manager through RegEx that fires whenever someone views a page. I'm trying to do this using the Page URL as a parameter so that the event fires when that URL is visited. It's for around 1400 URLs that are in the same sub-folder but have different page names. For example: https://www.example.com/products/product-name-1, https://www.example.com/products/product-name-2
What would be the best way to group these into one RegEx formula?
I've tried to separate all urls by using the '|' sign without any result. I've also tried this format, without any luck: (^/page-url-1/$|^/page-url-1/$|^/page-url-1/$|^/page-url-1/$)
A couple of things are happening with your attempt. First, you aren't escaping the '/'. Some regex flavors treat it as a delimiter, so preceding it with a \ tells the engine unambiguously that you want that literal character (the escape is harmless where it isn't required). It would look like this:
\/products\/page-url-1
I am assuming you are using a {{Page Path}} so the above would match for any paths that contain /products/page-url-1.
If you want the event to fire on all pages within the /products directory, there is an easier way of doing this.
\/products\/.*
What this will do is match any page within your /products directory. If you have a landing page on /products itself, it will be omitted from the firing. The '.' matches any character after the / and the '*' lets it repeat any number of times, including zero.
EDIT:
Since you aren't looking for all the products pages, you can use a matching group and list them all. I suspect that the product names will all be different enough that they don't share any common path elements, so you will have to list out the ones you want.
\/products\/(product-url-1|product-url-2|product-url-3).*
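With around 1400 page names, typing that group by hand is error-prone, so one option is to generate it from a list. The sketch below does that in Python. The function name build_products_pattern and the sample names are hypothetical, and unlike the pattern above it anchors both ends and allows an optional trailing slash, so that product-name-1 does not also match product-name-10. Whether the generated pattern can be pasted into Google Tag Manager unchanged is an assumption that would need checking.

```python
import re


def build_products_pattern(names):
    """Hypothetical helper: one anchored pattern for a list of page names."""
    # re.escape keeps any special characters in a page name literal.
    alternatives = "|".join(re.escape(name) for name in names)
    # Anchored at both ends; the trailing slash is optional (an assumption).
    return re.compile(r"^/products/(?:" + alternatives + r")/?$")


if __name__ == "__main__":
    pattern = build_products_pattern(["product-name-1", "product-name-2"])
    print(pattern.pattern)
    print(bool(pattern.match("/products/product-name-1")))
```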

Append UTM tracking to URLs

I'm trying to set a custom rule within the catalog settings in Facebook's business manager, I want to append UTM tracking to my product URLs and cannot do this from a find and replace regex function. A unique identifier of my product URLs is that they always end with a number 0-9. I'm new to regex and can't figure out how to do this, example below for reference:
Existing product URL:
https://www.example.com/product/12345
https://www.example.com/product/54321
Appended UTM tracking:
https://www.example.com/product/12345?utm_source=askjeeves&utm_medium=cpm
https://www.example.com/product/54321?utm_source=askjeeves&utm_medium=cpm
Any help on how to write a find and replace regular expression that appends tracking as in my example above would be much appreciated!
Screenshot: the custom rule settings in the FB business manager catalog, where I am trying to input this rule.
You can do this in your .htaccess. Try putting the following code into your .htaccess file (create one if you don't already have one) and let me know how it goes.
RewriteEngine On
RewriteRule product/([0-9]+) product/$1?utm_source=askjeeves&utm_medium=cpm [L]
With this code you'll navigate to www.example.com/product/12345, but the system will see the full URL, which is www.example.com/product/12345?utm_source=askjeeves&utm_medium=cpm
I don't know Facebook's find and replace function, but generally a regex should look like this:
(^http.*$)
Then replace with:
$1?utm_source=askjeeves&utm_medium=cpm
You can also try with:
\1?utm_source=askjeeves&utm_medium=cpm
If Facebook follows 'normal' regexes, this should work.
Edit: try these things too; they might work.
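To check the find and replace idea outside of Facebook, here is a small Python sketch. The function name append_utm is made up, and the pattern is tightened slightly so it only matches URLs ending in a digit, as described in the question. Python writes the back-reference as \g<1>; which syntax Facebook's rule editor expects is not something this example can confirm.

```python
import re

# Whole URL captured in group 1; it must start with "http" and end in a digit 0-9.
PRODUCT_URL = re.compile(r"^(http.*[0-9])$")


def append_utm(url):
    """Hypothetical helper: append the UTM parameters to a product URL."""
    # \g<1> re-inserts the matched URL; other URLs are returned unchanged.
    return PRODUCT_URL.sub(r"\g<1>?utm_source=askjeeves&utm_medium=cpm", url)


if __name__ == "__main__":
    print(append_utm("https://www.example.com/product/12345"))
```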

regex - Match # part of the url on the server

I'm trying to write a regex to match parts of URLs and use an SEO redirection WordPress plugin to create a 301 redirect on the matching results.
If, for example, I enter these URLs:
https://www.test.com/my-site
https://www.test.com/my-site/
I want to be redirected to:
https://www.test.com/your-site/
But if the URLs are followed by a hash (#) like the one below:
https://www.test.com/my-site/#/..
Do not redirect.
I have played around for a bit with regExr and this is as far as I could get:
regexr.com/3scpb
But when I try to implement it inside the plugin, the redirect doesn't work.
What am I doing wrong here?
Is it better to do it straight inside the .htaccess file?
Would it be more robust and reliable that way?
Thanks
The hash is never sent by the browser.
The hash is used internally by the browser to decide which fragment of the document to focus on. This is called the fragment identifier. It means your server will never see the # and anything after it, and you cannot change this behavior.
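A quick way to see this is to split the URL the way an HTTP client does before sending a request. The Python sketch below uses urllib.parse from the standard library; the function name request_target is hypothetical and only approximates the path and query string that end up in the request line.

```python
from urllib.parse import urlsplit


def request_target(url):
    """Hypothetical helper: the path (and query) part a client would request."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # parts.fragment is deliberately left out: it stays in the browser.
    return target


if __name__ == "__main__":
    print(request_target("https://www.test.com/my-site/"))
    print(request_target("https://www.test.com/my-site/#/.."))
```

Both calls produce the same target, so a server-side rule (plugin or .htaccess) has nothing to tell the two URLs apart.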

IIS URL rewrite not working properly 404 errors

I am upgrading a Joomla website set up on IIS 10. I have oldsite.com and newsite.com. The new site has a slightly different folder structure, but the page names and content are the same. Understandably, the client doesn't want to lose SEO ranking on the old pages and wants to redirect them to the correct ones on the new, upgraded site.
I need to do the following:
* is a wildcard and will be replaced with whatever is typed in the URL in its place
/div-services/* will redirect to /div/*
/div-questions/* will redirect to /div/questions/*
/fm-lw-services/* will redirect to /fm-lw/*
/locations/* will redirect to /contact/*
/resources/blog/* will redirect to /blog/*
/contact-us/* will redirect to /contact/*
I initially set up my pattern as
(.*)(div-services)(.*) becomes {R:1}( div){R:1}
It worked well until the matching phrase was repeated in some form in the URL. In this case, when “div-services” comes up again in the URL, it gets replaced as well.
For example, if the URL is newsite.com/div-services/xyz/abc-div-services, the rule replaces both occurrences of “div-services”, which is not desired; I only need to replace the first occurrence. I thought it was an easy fix and changed my pattern to the following
(.*)(/div-services/)(.*) replace to {R:1}(/div/){R:1}
Even though it validates successfully in the pattern test, it just doesn't work and does not rewrite the URL. I even tried with the escape character
(.*)(\/div-services\/)(.*) becomes {R:1}(/div/){R:1}
Still no luck. After a lot of digging I found the following example
div-services/(.*)$ becomes div/{R:1}
This worked generally well, but if the URL doesn't have the trailing forward slash it won't work.
For example, newsite.com/div-services won't work, but newsite.com/div-services/ and newsite.com/div-services/xyx will work fine.
I am at a loss; any help will be much appreciated. I just don't understand why I can't detect the forward slash /
FYI, I figured out why this was not working: (.*)(/div-services/)(.*) becomes {R:1}(/div/){R:1}
It is because the input starts after the first forward slash. I was assuming that the pattern would see the whole URL, which is why my regular expression validated in the test but didn't work in practice. When the rule runs, it only sees the URL after the first slash. That clarifies a lot and explains why many of my patterns were not working even though they passed the pattern test. Hopefully this saves others the hours I wasted because I didn't have a clear understanding of how it worked.
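The effect is easy to reproduce outside IIS. The Python sketch below is an analogy, not IIS configuration: it strips the leading slash the way the rule input is described above, and uses a pattern with an optional "/rest" part so that div-services without a trailing slash is handled too. The names RULE and rewrite_div_services are hypothetical, and Python's match.group(1) stands in for {R:1}.

```python
import re

# Anchored at the start so only the first path segment is considered;
# the "/rest" part is optional, so "div-services" alone also matches.
RULE = re.compile(r"^div-services(?:/(.*))?$")


def rewrite_div_services(url_path):
    """Hypothetical helper mimicking a rule that sees the path without its leading slash."""
    rule_input = url_path.lstrip("/")
    match = RULE.match(rule_input)
    if match is None:
        return url_path  # not under div-services: leave untouched
    rest = match.group(1) or ""
    return "/div/" + rest


if __name__ == "__main__":
    for path in ("/div-services", "/div-services/", "/div-services/xyz/abc-div-services"):
        print(path, "->", rewrite_div_services(path))
```

Because the pattern is anchored to the start of the input, a later "div-services" in the path is carried over as-is. The earlier pattern (.*)(/div-services/)(.*) finds nothing in an input like div-services/xyz, since there is no slash in front of div-services.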

regex rewrite url cluster

I've been trying to learn regex and it's terribly complicated. I'm not even positive that it's possible to rewrite these URLs without doing them individually. I can do them individually (search & replace), but there are a few different clusters and there are thousands of URLs (migration).
This is a Joomla site running acesef software. Here is an example URL from 1 particular cluster. The end of the URL is identical for old and new URL. Only the beginning directories have changed. So is there a way to match the end of the URL for all URLs in those particular directories from old to new and rewrite it with a single expression?
Old URL = www.domain.com/property-details/condominiums/3448-page-title
New URL = www.domain.com/bangkok/condos/rent/3448-page-title
I won't even bother posting what I've tried to write so far, because it's so far off. I'm trying to get my feet wet with regex, but this is a pretty complicated rewrite for a beginner.
Well uh, at face value you could just use this:
[^/]+$
This will give you anything after the last / so in your example, you'd get 3448-page-title
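Building on that, the last segment can be captured and re-attached to the new directories. The Python sketch below is one way to try it for the cluster in the question; the names OLD_CLUSTER, NEW_PREFIX and move_cluster are hypothetical, and each other cluster would need its own old and new prefix.

```python
import re

# Old directories for this cluster, then the last path segment captured in group 1.
OLD_CLUSTER = re.compile(r"^www\.domain\.com/property-details/condominiums/([^/]+)$")
NEW_PREFIX = "www.domain.com/bangkok/condos/rent/"


def move_cluster(old_url):
    """Hypothetical helper: keep the last segment, swap the leading directories."""
    match = OLD_CLUSTER.match(old_url)
    if match is None:
        return old_url  # belongs to a different cluster
    return NEW_PREFIX + match.group(1)


if __name__ == "__main__":
    print(move_cluster("www.domain.com/property-details/condominiums/3448-page-title"))
```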