Using htaccess regex with range to redirect

This is my .htaccess regex:
RewriteRule ^index.php/news/([0-9][0-9][0-9])?/16$ /index.php/news/$1/16 [L,R=301]
That works. But, I need the range 0-1879 to redirect to /index.php/news/$1/16 and more than 1879 to redirect to http://otherdomain/index.php/news/$1/16.

You shouldn't do this, but...
Here's how you match 0-1879:
[0-9]|[1-9][0-9]{1,2}|1[0-7][0-9]{2}|18[0-6][0-9]|187[0-9]
Here's how you match 1880-infinity
18[8-9][0-9]|19[0-9]{2}|[2-9][0-9]{3}|[1-9][0-9]{4,}
Note: with the alternation, you'll need to wrap these in a group () to properly nest the logic.
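To sanity-check the two ranges and the grouping note, here is a minimal sketch that uses Python's re module as a stand-in for the PCRE engine behind mod_rewrite. The names LOW, HIGH, UNGROUPED and classify are made up for this illustration, and the last branch of HIGH is written as [1-9][0-9]{4,} on the assumption that numbers with five or more digits should also count as "more than 1879".

```python
import re

# Alternations from the answer, wrapped in a non-capturing group so that
# ^ and $ apply to the whole alternation and not just one branch.
LOW = re.compile(r"^(?:[0-9]|[1-9][0-9]{1,2}|1[0-7][0-9]{2}|18[0-6][0-9]|187[0-9])$")
# Assumption: {4,} instead of {4}, so 100000 and above are covered too.
HIGH = re.compile(r"^(?:18[8-9][0-9]|19[0-9]{2}|[2-9][0-9]{3}|[1-9][0-9]{4,})$")

# Same LOW alternation without the group: ^ binds only to the first branch
# and $ only to the last, so the pattern is far too permissive.
UNGROUPED = re.compile(r"^[0-9]|[1-9][0-9]{1,2}|1[0-7][0-9]{2}|18[0-6][0-9]|187[0-9]$")


def classify(n: int) -> str:
    """Return 'low' for 0-1879 and 'high' for 1880 and above."""
    s = str(n)
    low, high = bool(LOW.match(s)), bool(HIGH.match(s))
    if low == high:
        raise ValueError(f"{n} matched both patterns or neither")
    return "low" if low else "high"


if __name__ == "__main__":
    for n in (0, 9, 10, 999, 1000, 1879, 1880, 1999, 2000, 99999, 100000):
        print(n, classify(n))
    # The ungrouped pattern "matches" 25000 because ^[0-9] matches its first digit.
    print(bool(UNGROUPED.search("25000")))
```

In a RewriteRule the same grouping applies: the alternation goes inside the parentheses that capture $1, between the fixed /news/ and /16 parts of the pattern.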
Instead I recommend...
Just match any run of digits and redirect to the other domain. If the number is 0-1879 (checked in your backend language, controller, etc.), then redirect them back to the original domain. Or figure out how to get the old news migrated over to the other domain, if you're doing what I think you're doing.

Related

Simplifying redirect in htaccess

I have redirects:
RewriteRule ^(.*)/thema(.*)$ https://www.newurl.com [R=301,L]
RewriteRule ^(.*)/stichpunkt(.*)$ https://newurl.com [R=301,L]
RewriteRule ^(.*)/author(.*)$ https://www.newurl.com [R=301,L]
RewriteRule ^(.*)/2023(.*)$ https://www.newurl.com [R=301,L]
is there a way to simplify these into one line?
I need to disable category, tag, author and date archives in Wordpress
This should really be done in WordPress itself. Otherwise WP is still going to generate and publish these URLs (eg. Sitemap, RSS feed, etc.).
Otherwise, if .htaccess is your only option then you should serve a 404, rather than redirect to the homepage. Whilst a redirect to the homepage is likely to be treated as a soft-404 by Google (and possibly other search engines) it runs the risk of being indexed under these "archive" URLs (and accessible with a site: search).
For example, at the top of the root .htaccess file (before any existing WP directives):
# Whatever your custom 404 page is (could be WordPress)
ErrorDocument 404 /404.php
# Force a 404 for category, tag, author and date archives
RewriteRule (^|/)(thema|author|stichpunkt|2\d{3})(/|$) - [R=404]
2\d{3} matches any 4-digit year starting with 2 (i.e. 2000-2999).
The regex matches any of those "words" only when they occur as a whole path segment (not partial matches).
R=404 - This is not a "redirect" (despite the use of the R flag). The 404 error document is served via an internal subrequest and a 404 HTTP response code is set on the initial response. If these URLs have previously been indexed then consider changing this to a "410 Gone" instead, ie. R=410 or simply G (shorthand flag).
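The whole-segment behaviour of that pattern can be checked outside Apache. The following is a rough sketch using Python's re module in place of mod_rewrite's PCRE engine; SEGMENT, is_archive and the sample paths are invented for the illustration, and the paths are written without a leading slash because that is how a RewriteRule in .htaccess sees them.

```python
import re

# Same regex as in the RewriteRule: the keyword must be preceded by the
# start of the path or a slash, and followed by a slash or the end.
SEGMENT = re.compile(r"(^|/)(thema|author|stichpunkt|2\d{3})(/|$)")


def is_archive(path: str) -> bool:
    """True if the rule would fire for this per-directory URL path."""
    return SEGMENT.search(path) is not None


if __name__ == "__main__":
    samples = [
        "thema/politik",      # whole first segment
        "blog/author/jane/",  # whole middle segment
        "news/2023",          # whole last segment
        "authors/jane",       # partial match only
        "my-thema",           # partial match only
        "20234/x",            # five digits, not a year segment
        "1999/01",            # year does not start with 2
    ]
    for path in samples:
        print(f"{path!r}: {is_archive(path)}")
```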
You can use a simple alternation with |:
RewriteRule ^.*/(thema|author|stichpunkt|2023) https://www.newurl.com [R=301,L]
You don't need to capture parts that you don't refer back to, so I removed the () around the .*. The parentheses around the alternation are still needed even if you are not interested in capturing that part; otherwise it would not be clear where the first value starts and the last one ends.
And you don't need to match .*$ either; you can just leave off the $ that anchors this pattern at the end.
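As a quick check that the single alternation behaves like the four separate rules, here is a small sketch in Python (its re module standing in for mod_rewrite's regex engine). ORIGINAL, COMBINED, the two helper functions and the sample paths are assumptions made up for this comparison.

```python
import re

# The four patterns from the question.
ORIGINAL = [
    re.compile(p)
    for p in (
        r"^(.*)/thema(.*)$",
        r"^(.*)/stichpunkt(.*)$",
        r"^(.*)/author(.*)$",
        r"^(.*)/2023(.*)$",
    )
]
# The single pattern from the answer.
COMBINED = re.compile(r"^.*/(thema|author|stichpunkt|2023)")


def matches_original(path: str) -> bool:
    """True if any of the four original rules would match."""
    return any(p.search(path) for p in ORIGINAL)


def matches_combined(path: str) -> bool:
    """True if the combined rule would match."""
    return COMBINED.search(path) is not None


if __name__ == "__main__":
    samples = [
        "blog/thema/politik",
        "blog/stichpunkt",
        "blog/author/jane",
        "archive/2023/05",
        "blog/authors",    # partial match: both variants still fire
        "thema/politik",   # no slash before the keyword: neither fires
        "blog/post-1",
    ]
    for path in samples:
        print(f"{path!r}: {matches_original(path)} {matches_combined(path)}")
```

Both variants require a slash before the keyword, so a keyword in the very first path segment is not matched by either.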

What 301 redirect would work for this string format?

I have an ecommerce store with the product URL format:
/categoryname/subcat1name/subcat2name/1450--my-widget
I will shorten it to:
/1450--my-widget
I can do the change within the ecommerce software, but I need to set up a mod rewrite redirect for the old URLs.
To avoid matching URLs for categories, content pages, etc, as well as product URLs of the new format, I need to match on all these conditions:
Does not contain the string "/info/"
Contains a slash, followed by 1 or more characters, followed by another slash, followed by 1 or more digits, followed by "--", followed by 1 or more characters
What directive would work?
EDIT:
More examples of matching and non matching strings
Matches for old product url:
/a-category/this-category/333--my-widget
/some-cat/34--widgetname
Non matches:
/1918--widgetcategory/
/info/12--about-us
/quick-order
/login?back=my-account
/2050--my-widgetname
You can use this 301 redirect rule as your very first rule in site root .htaccess:
RedirectMatch 301 ^(?!.*/info/).*/[^/]+/([0-9]+--[^/]+)/?$ /$1
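The pattern can be tried against the example URLs from the question before deploying it. This is only a sketch: Python's re module stands in for Apache's regex engine, redirect_target is a hypothetical helper, and the query string is left off because RedirectMatch is matched against the URL path only.

```python
import re
from typing import Optional

# Same regex as in the RedirectMatch directive.
OLD_PRODUCT = re.compile(r"^(?!.*/info/).*/[^/]+/([0-9]+--[^/]+)/?$")


def redirect_target(url_path: str) -> Optional[str]:
    """Return the new URL path, or None if the directive would not match."""
    m = OLD_PRODUCT.search(url_path)
    return "/" + m.group(1) if m else None


if __name__ == "__main__":
    for path in (
        "/a-category/this-category/333--my-widget",
        "/some-cat/34--widgetname",
        "/1918--widgetcategory/",
        "/info/12--about-us",
        "/quick-order",
        "/login",
        "/2050--my-widgetname",
    ):
        print(f"{path} -> {redirect_target(path)}")
```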

negating rewrite rules for apache

I'm trying to make it so that if the ending of a URL contains anything surrounding a number (except that the first part can be any combination of numbers, a hyphen or a p), then the url is redirected with whatever surrounding the number is taken off.
Here's my regex:
RewriteRule ^all/[^p^P^0-9^\-]+([0-9]+).*$ /allof/$1 [R=301,NC,L]
If I tried these test URLs, the redirect should happen, but does not:
http://example.com/all/-a*1
http://example.com/all/plus100
If I tried this test URL, the redirect does not happen which is correct:
http://example.com/all/p1-100
If I tried these test URLs, the redirect happens, which is correct:
http://example.com/all/(100) - redirects to http://example.com/allof/100
http://example.com/all/minus100 - redirects to http://example.com/allof/100
Perhaps my regex is faulty. I tried removing the extra carets in the square brackets except for the first, and that didn't help, and I don't want to replace the square brackets with only a .* since I then won't be able to capture the number. What could I be doing wrong?
You can use negative lookahead in your rule:
RewriteRule ^all/(?!p\d*-)\D*(\d+) /allof/$1 [R=301,NC,L]
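To see how the lookahead treats the test URLs from the question, here is a minimal sketch in Python, whose re module supports the same lookahead syntax as the PCRE engine used by mod_rewrite. RULE and rewrite are names invented for this example, and re.IGNORECASE plays the role of the NC flag.

```python
import re
from typing import Optional

# (?!p\d*-) rejects paths that start with "p", optional digits and a hyphen;
# \D* then skips any non-digits and (\d+) captures the first number.
RULE = re.compile(r"^all/(?!p\d*-)\D*(\d+)", re.IGNORECASE)


def rewrite(path: str) -> Optional[str]:
    """Return the redirect target, or None if the rule does not match."""
    m = RULE.search(path)
    return f"/allof/{m.group(1)}" if m else None


if __name__ == "__main__":
    for path in ("all/-a*1", "all/plus100", "all/p1-100", "all/(100)", "all/minus100"):
        print(f"{path} -> {rewrite(path)}")
```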

Regular Expression to capture URLs with ascii encoded characters

Having migrated a Wordpress site to a new build, I need to capture a lot of old URLs and redirect them to the same content on the new site. The problem is that the old site has a lot of URLs with ascii-encoded chars and Wordpress has stripped them out on the current site. For example:
/blog/uncategorized/germany%E2%80%99s-ageing-population-working-longer-working-better.html
would redirect to:
/blog/germanys-ageing-population-working-longer-working-better/
Can anyone provide a regular expression that would remove the ascii-encoded characters?
For matching the encoded characters, you would use the following regex pattern:
%[A-Z0-9]{2}
How you perform the replacement will depend on the language/tool you are using.
You have to match against the request here, because with redirect and rewrite rules, the URI is decoded before the patterns get applied. That means you'd be matching against the decoded character (here ’, from %E2%80%99) instead of the encoded strings. So you'll want something like:
RewriteEngine On
RewriteCond %{THE_REQUEST} \ /blog/([^\?\ ]*)\%[A-Z0-9]{2}([^\?\ ]*)
RewriteRule ^ /blog/%1%2 [L,R=301,NE]
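One thing worth noticing is that the condition removes a single %XX sequence per request, so a URL with several encoded bytes is cleaned up over a chain of redirects. The sketch below models that in Python under some assumptions: the re module stands in for Apache's regex engine, the request line is simplified to "GET <path> HTTP/1.1", and follow_redirects is a hypothetical helper that plays the part of a browser following each 301.

```python
import re
from typing import Tuple

# Same regex as in the RewriteCond, tested against the raw request line.
COND = re.compile(r" /blog/([^? ]*)%[A-Z0-9]{2}([^? ]*)")


def follow_redirects(path: str, limit: int = 20) -> Tuple[str, int]:
    """Apply the rule repeatedly; return the final path and the hop count."""
    hops = 0
    while hops < limit:
        m = COND.search(f"GET {path} HTTP/1.1")
        if m is None:
            break
        # RewriteRule ^ /blog/%1%2 : keep the text around the encoded byte.
        path = f"/blog/{m.group(1)}{m.group(2)}"
        hops += 1
    return path, hops


if __name__ == "__main__":
    old = ("/blog/uncategorized/germany%E2%80%99s-ageing-population-"
           "working-longer-working-better.html")
    final, hops = follow_redirects(old)
    print(final)
    print(hops)
```

In this model the rule only strips the encoded bytes. The "uncategorized/" segment and the ".html" suffix from the question's example are left in place and would need their own rules.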

How to rewrite this URL to a redirect page?

I am using Microsoft-IIS/7.5 on a hosted server (Hostek.com)
I have an existing site with 2,820 indexed links in Google. You can see the results by searching Google with this: site:flyingpiston.com Most of the pages use a section, makerid, or bikeid to get the right information. Most of the links look like this:
flyingpiston.com/?BikeID=1068
flyingpiston.com/?MakerID=1441
flyingpiston.com/?Section=Maker&MakerID=1441
flyingpiston.com/?Section=Bike&BikeID=1234
On the new site, I am doing URL rewriting using .htaccess. The new URLs will look like this:
flyingpiston.com/bike/1068/
flyingpiston.com/maker/1123/
Basically, I just want to use my htaccess file to send any request with a "?" question mark in it directly to a ColdFusion page called redirect.cfm. On this page, I will use ColdFusion to write a custom 301 redirect. Here's what ColdFusion's redirect looks like:
<cfheader statuscode="301" statustext="Moved Permanently">
<cfheader name="Location" value="http://www.newurl/bike/1233/">
<cfabort>
So, what does my htaccess file need to look like if I want to push everything with a question mark to a particular page? Here's what I have tried, but it's not working.
RewriteEngine on
RewriteRule ^? /redirect.cfm [NS,L]
Update. Using the advice from below, I am using this rule:
RewriteRule \? /redirect/redirect.cfm [NS,L]
To try to push this request
http://flyingpiston2012-com.securec37.ezhostingserver.com/?bikeid=1235
To this page:
http://flyingpiston2012-com.securec37.ezhostingserver.com/redirect/redirect.cfm
There are a couple of reasons why what you're trying isn't working.
The first one is that RewriteRule uses a regex, and ? is a regex metacharacter, which therefore needs to be escaped with a backslash (\?) to tell it to match the literal question mark character.
However, the second part of the problem is that the regex for RewriteRule is only tested against the filename part of the URL - it specifically excludes the query string.
In order to match against the query string you need to use the RewriteCond directive, placed on the line before the rule (but applied in between the RewriteRule matching and replacing), acting as an additional filter. The useful bit is that you can specify which part of the URL to match against (as well as having the option for using non-regex tests).
Bearing all this in mind, the simplest way to match/rewrite a request with a query string is:
RewriteCond %{QUERY_STRING} .
RewriteRule .* /redirect/redirect.cfm
The %{QUERY_STRING} is what the regex is tested against (everything in CF's CGI scope can be used here, and some other stuff too - see the Server Variables box in the docs).
The single . just says "make sure the matched item has any single character"
At the moment, this rule will preserve the existing query string - if you want to discard it, you can place a ? onto the end of the replacement URL. (If you need to use a query string on the URL and not discard the old version, use the [QSA] flag.)
In the opposite direction, you're losing the filename part of the URL - to preserve this, you probably want to append it onto the replacement as PATH_INFO, using the automatic whole-match capture $0.
Putting these two things together gives:
RewriteCond %{QUERY_STRING} .
RewriteRule .* /redirect/redirect.cfm/$0?
One final thing is that you'll want to guard against infinite loops - the above rule strips the query string so it will always fail the RewriteCond, but better to be safe (especially if you might need to add a query string), which you can do with an extra RewriteCond:
RewriteCond %{QUERY_STRING} .
RewriteCond %{REQUEST_URI} !/redirect/redirect\.cfm
RewriteRule .* /redirect/redirect.cfm/$0?
Multiple RewriteCond are combined as ANDs, and the ! negates the match.
You can of course add whatever flags are required to the RewriteRule to have it behave as desired.
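The decision logic of the final three directives can be modelled in a few lines. This is a deliberately simplified sketch in Python, not how mod_rewrite is implemented: rewrite is a hypothetical function, the path and query string are passed in separately, and the leading slash is stripped by hand to imitate what a RewriteRule in .htaccess sees.

```python
import re
from typing import Optional


def rewrite(path: str, query: str) -> Optional[str]:
    """Return the rewritten target, or None if the request is left alone.

    path is the URL path with its leading slash; query excludes the '?'.
    """
    # RewriteCond %{QUERY_STRING} .  (query string must be non-empty)
    if not re.search(r".", query):
        return None
    # RewriteCond %{REQUEST_URI} !/redirect/redirect\.cfm  (loop guard)
    if re.search(r"/redirect/redirect\.cfm", path):
        return None
    # RewriteRule .* /redirect/redirect.cfm/$0?
    # $0 is the whole match; the trailing ? discards the old query string.
    per_dir_path = path[1:] if path.startswith("/") else path
    whole_match = re.search(r".*", per_dir_path).group(0)
    return f"/redirect/redirect.cfm/{whole_match}"


if __name__ == "__main__":
    print(rewrite("/", "bikeid=1235"))
    print(rewrite("/index.cfm", "MakerID=1441"))
    print(rewrite("/bike/1068/", ""))
    print(rewrite("/redirect/redirect.cfm", "x=1"))
```

Because both conditions must pass, a request is rewritten only when it has a query string and is not already aimed at the redirect page.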