Simplifying redirect in htaccess - regex

I have redirects:
RewriteRule ^(.*)/thema(.*)$ https://www.newurl.com [R=301,L]
RewriteRule ^(.*)/stichpunkt(.*)$ https://newurl.com [R=301,L]
RewriteRule ^(.*)/author(.*)$ https://www.newurl.com [R=301,L]
RewriteRule ^(.*)/2023(.*)$ https://www.newurl.com [R=301,L]
Is there a way to simplify these into one line?

I need to disable category, tag, author and date archives in Wordpress
This should really be done in WordPress itself. Otherwise WP is still going to generate and publish these URLs (eg. Sitemap, RSS feed, etc.).
Otherwise, if .htaccess is your only option then you should serve a 404, rather than redirect to the homepage. Whilst a redirect to the homepage is likely to be treated as a soft-404 by Google (and possibly other search engines) it runs the risk of being indexed under these "archive" URLs (and accessible with a site: search).
For example, at the top of the root .htaccess file (before any existing WP directives):
# Whatever your custom 404 page is (could be WordPress)
ErrorDocument 404 /404.php
# Force a 404 for category, tag, author and date archives
RewriteRule (^|/)(thema|author|stichpunkt|2\d{3})(/|$) - [R=404]
2\d{3} matches any 4-digit year beginning with 2 (ie. 2000 to 2999).
The regex matches any of those "words" only when they occur as a whole path segment (not partial matches).
R=404 - This is not a "redirect" (despite the use of the R flag). The 404 error document is served via an internal subrequest and a 404 HTTP response code is set on the initial response. If these URLs have previously been indexed then consider changing this to a "410 Gone" instead, ie. R=410 or simply G (shorthand flag).
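As a rough check of the pattern above, here is a small sketch that uses Python's re module as a stand-in for Apache's regex engine (an assumption; the two should agree on syntax this simple). The sample paths are made up, and they are written without a leading slash, which is how the RewriteRule pattern sees them in .htaccess.

```python
import re

# Same pattern as in the RewriteRule above.
ARCHIVE = re.compile(r"(^|/)(thema|author|stichpunkt|2\d{3})(/|$)")

def is_archive(url_path: str) -> bool:
    """True if a whole path segment is an archive word or a year 2000-2999."""
    return ARCHIVE.search(url_path) is not None

# Hypothetical sample paths (not taken from the question).
samples = ["thema/news/", "blog/2023/05/", "author",
           "authors/jane/", "my-thema/", "20234/"]
for path in samples:
    print(path, is_archive(path))
```

Only the first three samples are whole-segment matches; "authors", "my-thema" and "20234" are partial matches and are left alone.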

You can use a simple alternation with |:
RewriteRule ^.*/(thema|author|stichpunkt|2023) https://www.newurl.com [R=301,L]
You don't need to capture parts that you don't refer back to, so I removed the () around the .*. The parentheses around the alternation are still needed, even if you are not interested in capturing that part, because otherwise it would not be clear where the first alternative starts and the last one ends.
And you don't need to match the trailing .*$ either; you can just leave off the $ that anchors the pattern at the end.
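To see that the single pattern behaves like the four original ones, a quick comparison can be run outside Apache. This is only a sketch: it assumes Python's re module treats these patterns the same way Apache does, and the sample paths are invented for illustration.

```python
import re

# The four original patterns from the question.
ORIGINALS = [re.compile(p) for p in (
    r"^(.*)/thema(.*)$",
    r"^(.*)/stichpunkt(.*)$",
    r"^(.*)/author(.*)$",
    r"^(.*)/2023(.*)$",
)]
# The single combined pattern from this answer.
COMBINED = re.compile(r"^.*/(thema|author|stichpunkt|2023)")

def matches_any_original(path: str) -> bool:
    return any(p.search(path) for p in ORIGINALS)

def matches_combined(path: str) -> bool:
    return COMBINED.search(path) is not None

# Hypothetical URL-paths, without the leading slash (as seen in .htaccess).
samples = ["blog/thema/a", "thema/a", "x/author",
           "x/2023/05", "x/stichpunkt-foo", "x/other"]
for path in samples:
    assert matches_any_original(path) == matches_combined(path)
```

Note that "thema/a" matches neither version, because both require a slash before the keyword.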

Related

HTACCESS : Redirect (301) thousands of Url's containing directories to simple url's

I need to use .htaccess to convert tons of URLs already produced (and already indexed...) by WordPress for articles containing folders/subfolders into simple URLs without any folder/subfolder name.
Example:
FROM https://www.website.com/Animals/Cats/mycat.html TO https://www.website.com/mycat.html
FROM https://www.website.com/Animals/Dogs/mydog.html TO https://www.website.com/mydog.html
FROM https://www.website.com/Countries/France/bordeaux.html TO https://www.website.com/bordeaux.html
etc...
I have already changed the permalink options in the WordPress config, so the URLs produced now are in the correct format (eg. https://www.website.com/bordeaux.html) without any folder name.
My problem is redirecting all the OLD URLs to this new format, to prevent 404s and preserve the ranking.
I tried adding this line to my .htaccess:
RewriteRule ^/(.*)\.html$ /$1 [R=301,L,NC]
I also tried the RedirectMatch 301 (.*)\.html$ method and the result is the same. I'm going crazy with this.
What am I doing wrong, and could you help me?
Thanks
RewriteRule ^/(.*)\.html$ /$1 [R=301,L,NC]
The URL-path matched by the RewriteRule pattern never starts with a slash. But you can use this to only match the last path-segment when there are more "folders". And the target URL also needs to end in .html (as per your examples).
So, this can be written in a single directive:
RewriteRule /([^/]+\.html)$ /$1 [R=301,L]
This handles any level of nested "folders". But does not match /foo.html (the target URL) in the document root (due to the slash prefix on the regex), so no redirect-loop.
(No need for any preceding conditions.)
Here the $1 backreference includes the .html suffix.
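The behaviour described above can be sketched in Python, assuming its re module handles this pattern the same way Apache does. The helper name redirect_target is hypothetical; the input is the URL-path without its leading slash, as the RewriteRule pattern sees it in .htaccess.

```python
import re
from typing import Optional

# Same pattern as in the directive above.
LAST_SEGMENT = re.compile(r"/([^/]+\.html)$")

def redirect_target(url_path: str) -> Optional[str]:
    """Return the redirect target, or None when the rule does not match."""
    m = LAST_SEGMENT.search(url_path)
    return "/" + m.group(1) if m else None

print(redirect_target("Animals/Cats/mycat.html"))
print(redirect_target("mycat.html"))
```

The second call returns None: a file already in the document root has no slash in front of it, so it is not redirected again.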
Just match the last part of the url and pass it to the redirect:
RewriteRule /([^/]+)\.html$ /$1.html [R=301,L,NC]
It will match any number of directories like:
https://www.website.com/dir1/page1.html
https://www.website.com/dir1/dir2/page2.html
https://www.website.com/dir1/dir2/dir3/page3.html
https://www.website.com/dir1/dir2/dir3/dir3/page4.html

Having trouble with some redirect/rewrite codes .. 301 and 410s

Brand new here and totally NOT a coder, so be gentle. The level of understanding I have here is about that of a toddler, so pretend you're talking to a 5 year old and I should be able to keep up.
I'm switching to a new server, and no longer using coppermine gallery. I can't get the redirects from the old cpg galleries and images to work.
For albums and categories that I will not redirect
(The url I want to redirect here would be)
http://www.example.com/stock/thumbnails.php?album=62
They're gone and no longer exist, I wrote the 410 rule as
RewriteRule ^stock/thumbnails\.php\?album=62$ - [R=410, L]
That breaks the new site and creates a 500 error
For old albums that I want to redirect to a new url, such as this url
http://www.example.com/stock/thumbnails.php?album=3
I wrote
RewriteRule ^stock/thumbnails\.php\?album=36$ https://www.example.com/gallery/appalachian-trail-photos/ [R=301,L]
But it does nothing. The urls show as 404 pages. I also tried it as
Redirect 301 /stock/index.php?cat=2 https://www.example.com/gallery/outdoor-recreation-photos/
which also does nothing.
I also want to redirect any image display pages to the root gallery of their relative album. So an image from a particular cpg album would go to the gallery page for that on the new site.
A url like this
http://www.example.com/stock/displayimage.php?album=1&pid=4563#top_display_media
I wrote the redirect as
RewriteRule ^stock/displayimage\.php\?album=1&.*$ https://www.example.com/gallery/canadian-wildlife-photos/ [R=301,L]
RewriteRule ^stock/displayimage\.php\?album=32&.*$ - [R=410,L]
to send any image from album 1 to the CA wildlife photos album and any image from album 32 is gone.
None of these last work but I can't see what's wrong with them?
Any help would be superduper appreciated, thanks. Apologies in advance for my ignorance.
RewriteRule ^stock/thumbnails\.php\?album=62$ - [R=410, L]
                                                       ^--ERROR HERE
The erroneous space in the flags argument is likely the cause of the 500 error. This is a syntax error, so just "breaks". It should be [R=410,L] (no space). The L flag is actually redundant here when you specify a non-3xx status code, so this is the same as simply [R=410], which can be further simplified to just [G]. The G flag is shorthand for R=410 (ie. "410 Gone").
However, the RewriteRule pattern (first argument) matches against the URL-path only. This does not match against the query string (the part of the URL after the first ?). To match the query string you need a separate condition (RewriteCond directive) and match against the QUERY_STRING server variable (which does not include the ? prefix).
For example, the above should be written something like:
RewriteCond %{QUERY_STRING} ^album=62$
RewriteRule ^stock/thumbnails\.php$ - [G]
This matches the URL /stock/thumbnails.php?album=62 exactly. And serves a "410 Gone". If there are any other URL parameters (in the query string) then the match will fail. eg. /stock/thumbnails.php?foo=1&album=62 will not match.
To match the URL parameter album=62 anywhere in the query string (if there are other URL params) then tweak the CondPattern (2nd argument to the RewriteCond directive) like so:
RewriteCond %{QUERY_STRING} (^|&)album=62(&|$)
:
The additional alternation subgroups (^|&) and (&|$) before and after the URL parameter, ensure we only match that exact URL parameter and not fooalbum=62 or album=623, etc.
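To illustrate how the path and the query string are tested separately, here is a minimal Python sketch. It assumes Python's re module matches these patterns the same way Apache does, and the helper name is_gone is made up for the example.

```python
import re
from urllib.parse import urlsplit

# RewriteRule pattern: URL-path only, no leading slash in .htaccess.
PATH = re.compile(r"^stock/thumbnails\.php$")
# RewriteCond pattern: tested against QUERY_STRING (no leading "?").
PARAM = re.compile(r"(^|&)album=62(&|$)")

def is_gone(url: str) -> bool:
    """True if both the path and the query string conditions match."""
    parts = urlsplit(url)
    path = parts.path.lstrip("/")
    return bool(PATH.search(path) and PARAM.search(parts.query))

base = "http://www.example.com/stock/thumbnails.php"
for query in ("album=62", "foo=1&album=62", "album=623", "fooalbum=62"):
    print(query, is_gone(base + "?" + query))
```

The first two query strings match; album=623 and fooalbum=62 do not, because of the (^|&) and (&|$) boundaries.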
RewriteRule ^stock/displayimage\.php\?album=1&.*$ https://www.example.com/gallery/canadian-wildlife-photos/ [R=301,L]
The above points should resolve the main issue with this rule (matching the query string). However, you also need to add the QSD flag in order to remove the original query string from the redirect response. Otherwise, the original query string (ie. album=1&....) will be appended to the end of target URL (eg. /gallery/canadian-wildlife-photos/?album=1&...).
For example:
RewriteCond %{QUERY_STRING} (^|&)album=1(&|$)
RewriteRule ^stock/displayimage\.php$ https://www.example.com/gallery/canadian-wildlife-photos/ [QSD,R=301,L]
Redirect 301 /stock/index.php?cat=2 https://www.example.com/gallery/outdoor-recreation-photos/
Note that the (mod_alias) Redirect directive does not match the query string either (only the URL-path). To match the query string you need to use mod_rewrite (RewriteRule and RewriteCond) as mentioned above.

How to set up rewrite rule for a list of keywords in the URL?

What I wish to do
I have a number of URLs I need to redirect, along with a 301 permanent redirect header being sent to browser. I've determined doing this at the htaccess level is most efficient (as opposed to doing it with a function in the Wordpress site this relates to).
The URLs to redirect are:
https://www.mydomain.com.au/search-result/?location=victoria
https://www.mydomain.com.au/search-result/?location=new-south-wales
https://www.mydomain.com.au/search-result/?location=queensland
https://www.mydomain.com.au/search-result/?location=south-australia
https://www.mydomain.com.au/search-result/?location=tasmania
https://www.mydomain.com.au/search-result/?location=northern-territory
Where to redirect to
I want to redirect them to the home page: https://mydomain.com.au/ (I might later choose to redirect them all elsewhere, but I can do that part).
NOTE: The query string should be dropped from the redirect.
I am not sure whether it's best to test for all six of those location= variables, or to simply test for the one location= variable that is not to redirect.
The one location= variable that is not to redirect is ?location=western-australia. E.g.,
https://www.mydomain.com.au/search-result/?location=western-australia
Additional considerations
Note that there are other .../search-result/ URLs that have different variables in the query strings, such as ?weather=... or ?water=.... For example, https://www.mydomain.com.au/search-result/?location=victoria&weather=part-shade&water=&pasture=
As seen in that example, it's also possible multiple variables will be in the query string, such as ?location=tasmania&weather=&water=moderate&pasture=.
So I need to test for the presence of the above listed location= irrespective of whether or not it has other variables after it. The location= variable is always the first in the overall query string.
I am thinking it may be as simple as testing for the presence of /search-result/ AND that followed by victoria | tasmania | northern-territory | etc. in the URL. I can't be 100% sure those words (victoria, etc.) won't show up in any other URLs, hence my reason for only redirecting if those words follow either location= or /search-result/. I suspect location= would be a suitable condition.
I've played around with modifying many rewrite rule examples I've found online, and couldn't get anything to work. I'd either get a 501 error (site crash), or nothing would happen at all.
Thank you.
Not sure if you've tried these, but they worked well for me:
To redirect any location value, except western-australia:
# The request path is /search-result/ or maybe /search-result
RewriteCond %{REQUEST_URI} ^/search-result/?$
# ..and the query string 'location' is not empty
RewriteCond %{QUERY_STRING} (^|&)location=.+($|&)
# ..and the value is not 'western-australia'.
RewriteCond %{QUERY_STRING} !(^|&)location=western-australia($|&)
# Redirect to the home page. QSD drops the query string (Apache 2.4+).
RewriteRule . / [QSD,R=301,NC,L]
To redirect only certain location values:
RewriteCond %{REQUEST_URI} ^/search-result/?$
# Redirect only certain location values - (<value>|<value>|...).
RewriteCond %{QUERY_STRING} (^|&)location=(victoria|new-south-wales)($|&)
# QSD drops the query string from the redirect (Apache 2.4+).
RewriteRule . / [QSD,R=301,NC,L]
And note that, in WordPress, you need to put the above before the WordPress rules:
# This is a sample .htaccess file used on a WordPress site.
# PLACE YOUR CUSTOM RULES HERE.
# BEGIN WordPress
<IfModule mod_rewrite.c>
# ...
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
I.e. Place your rules above the # BEGIN WordPress line, to avoid getting 404 errors.
And btw, I'm no htaccess expert, but hopefully this answer helps you. :)
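If you want to try the three conditions from the "any location except western-australia" rule without touching the server, something like the following Python sketch may help. It assumes Python's re module behaves like Apache's regex engine for these patterns, and should_redirect is just a made-up helper name.

```python
import re

# The three RewriteCond patterns from the first rule set.
URI = re.compile(r"^/search-result/?$")
HAS_LOCATION = re.compile(r"(^|&)location=.+($|&)")
IS_WA = re.compile(r"(^|&)location=western-australia($|&)")

def should_redirect(request_uri: str, query_string: str) -> bool:
    """All conditions are ANDed; the last one is negated."""
    return bool(
        URI.search(request_uri)
        and HAS_LOCATION.search(query_string)
        and not IS_WA.search(query_string)
    )

print(should_redirect("/search-result/", "location=victoria"))
print(should_redirect("/search-result/", "location=western-australia"))
```

The first call gives True and the second False, which is the behaviour asked for in the question.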

RewriteRule to remove superfluous single "?" in URL

I am using IBM HTTP server configuration file to rewrite a URL redirected from CDN.
For some reason the URL comes with a superfluous single question mark even when there is no query string. For example:
/index.html?
I'm in the process of making the 301 redirect for this. I want to remove the single "?" from the url but keep it if there is any query string.
Here's what I tried but it doesn't work:
RewriteRule ^/index.html? http://localhost/index.html [L,R=301]
update:
I tried this rule with the correct regular expression, but it is never triggered either.
RewriteRule ^/index.html\?$ http://localhost/index.html [L,R=301]
I tried to write another rule to rewrite "index.html" to "test.html" and I input "index.html?" in browser, it redirected me to "test.html?" but not "index.html".
You need to use a trick since RewriteRule implicitly matches against just the path component of the URL. The trick is looking at the unparsed original request line:
RewriteEngine ON
# literal ? followed by un-encoded space.
RewriteCond %{THE_REQUEST} "\? "
# Ironically the ? here means drop any query string.
RewriteRule ^/index.html /index.html? [R=301]
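A small Python sketch of why this works, under the assumption that Python's urlsplit and re behave comparably to what the server does: the parsed path never contains the "?", so only the raw request line can tell "/index.html?" apart from "/index.html". The request lines below are made-up examples and has_empty_query is a hypothetical helper name.

```python
import re
from urllib.parse import urlsplit

# The parsed path has no "?", which is why a RewriteRule pattern cannot see it.
print(urlsplit("/index.html?").path)

# Same idea as the RewriteCond above: a literal "?" followed by a space.
BARE_QUESTION_MARK = re.compile(r"\? ")

def has_empty_query(request_line: str) -> bool:
    """True if the raw request line has a '?' directly before the space."""
    return BARE_QUESTION_MARK.search(request_line) is not None

for line in ("GET /index.html? HTTP/1.1",
             "GET /index.html?a=1 HTTP/1.1",
             "GET /index.html HTTP/1.1"):
    print(line, has_empty_query(line))
```

Only the first request line matches, so a real query string such as ?a=1 is left untouched.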
Question-mark is a Regular Expression special character, which means "the preceding character is optional". Your rule is actually matching index.htm or index.html.
Instead, try putting the question-mark in a "character class". This seems to be working for me:
RewriteRule ^/index.html[?]$ http://localhost/index.html [L,R=301]
($ to signify end-of-string, like ^ signifies start-of-string)
See http://publib.boulder.ibm.com/httpserv/manual60/mod/mod_rewrite.html (for your version of Apache, which is not the latest)
Note from our earlier attempts, escaping the question-mark doesn't seem to work.
Also, I'd push the CDN on why that question-mark is being sent. This doesn't seem a normal pattern.

How to rewrite this URL to a redirect page?

I am using Microsoft-IIS/7.5 on a hosted server (Hostek.com)
I have an existing site with 2,820 indexed links in Google. You can see the results by searching Google with this: site:flyingpiston.com Most of the pages use a section, makerid, or bikeid to get the right information. Most of the links look like this:
flyingpiston.com/?BikeID=1068
flyingpiston.com/?MakerID=1441
flyingpiston.com/?Section=Maker&MakerID=1441
flyingpiston.com/?Section=Bike&BikeID=1234
On the new site, I am doing URL rewriting using .htaccess. The new URLs will look like this:
flyingpiston.com/bike/1068/
flyingpiston.com/maker/1123/
Basically, I just want to use my htaccess file to direct any request with a "?" question mark in it directly to a ColdFusion page called redirect.cfm. On this page, I will use ColdFusion to write a custom 301 redirect. Here's what ColdFusion's redirect looks like:
<cfheader statuscode="301" statustext="Moved Permanently">
<cfheader name="Location" value="http://www.newurl/bike/1233/">
<cfabort>
So, what does my htaccess file need to look like if I want to push everything with a question mark to a particular page? Here's what I have tried, but it's not working.
RewriteEngine on
RewriteRule ^? /redirect.cfm [NS,L]
Update. Using the advice from below, I am using this rule:
RewriteRule \? /redirect/redirect.cfm [NS,L]
To try to push this request
http://flyingpiston2012-com.securec37.ezhostingserver.com/?bikeid=1235
To this page:
http://flyingpiston2012-com.securec37.ezhostingserver.com/redirect/redirect.cfm
There are a couple of reasons why what you're trying isn't working.
The first one is that RewriteRule uses a regex, and ? is a regex metacharacter, which therefore needs to be escaped with a backslash (\?) to tell it to match the literal question mark character.
However, the second part of the problem is that the regex for RewriteRule is only tested against the path part of the URL - it specifically excludes the query string.
In order to match against the query string you need to use the RewriteCond directive, placed on the line before the rule (but applied in between the RewriteRule matching and replacing), acting as an additional filter. The useful bit is that you can specify which part of the URL to match against (as well as having the option for using non-regex tests).
Bearing all this in mind, the simplest way to match/rewrite a request with a query string is:
RewriteCond %{QUERY_STRING} .
RewriteRule .* /redirect/redirect.cfm
The %{QUERY_STRING} is what the regex is tested against (everything in CF's CGI scope can be used here, and some other stuff too - see the Server Variables box in the docs).
The single . just says "make sure the matched item has any single character"
At the moment, this rule will preserve the existing query string - if you want to discard it, you can place a ? onto the end of the replacement URL. (If you need to use a query string on the URL and not discard the old version, use the [QSA] flag.)
In the opposite direction, you're losing the filename part of the URL - to preserve this, you probably want to append it onto the replacement as PATH_INFO, using the automatic whole-match capture $0.
These two things together provide:
RewriteCond %{QUERY_STRING} .
RewriteRule .* /redirect/redirect.cfm/$0?
One final thing is that you'll want to guard against infinite loops - the above rule strips the query string so it will always fail the RewriteCond, but better to be safe (especially if you might need to add a query string), which you can do with an extra RewriteCond:
RewriteCond %{QUERY_STRING} .
RewriteCond %{REQUEST_URI} !/redirect/redirect\.cfm
RewriteRule .* /redirect/redirect.cfm/$0?
Multiple RewriteCond are combined as ANDs, and the ! negates the match.
You can of course add whatever flags are required to the RewriteRule to have it behave as desired.
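For illustration only, the final rule set can be modelled roughly in Python like this. The function name rewrite is made up, and the model assumes .htaccess context, where the matched path ($0) has no leading slash; it is a sketch of the logic, not of how the server implements it.

```python
from typing import Tuple

TARGET = "/redirect/redirect.cfm"

def rewrite(path: str, query: str) -> Tuple[str, str]:
    """Model the two RewriteCond lines and the RewriteRule above.

    path  -- URL-path without the leading slash (what $0 would hold)
    query -- query string without the leading "?"
    """
    request_uri = "/" + path
    # Cond 1: query string is non-empty. Cond 2: not already the target.
    if query and TARGET not in request_uri:
        # Append the old path as PATH_INFO; the trailing "?" drops the query.
        return f"{TARGET}/{path}", ""
    return request_uri, query

print(rewrite("", "BikeID=1068"))
print(rewrite("bike/1068/", ""))
print(rewrite("redirect/redirect.cfm", "x=1"))
```

Only the first request is rewritten; the second has no query string and the third is already the redirect page, so the loop guard leaves it alone.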