HTACCESS: Redirect (301) thousands of URLs containing directories to simple URLs - regex

I need to use .htaccess to redirect a large number of URLs already generated (and already indexed) by WordPress for articles. The old URLs contain folders/subfolders; the new ones are simple URLs without any folder/subfolder name.
Example:
FROM https://www.website.com/Animals/Cats/mycat.html TO https://www.website.com/mycat.html
FROM https://www.website.com/Animals/Dogs/mydog.html TO https://www.website.com/mydog.html
FROM https://www.website.com/Countries/France/bordeaux.html TO https://www.website.com/bordeaux.html
etc...
I have already changed the permalink options in the WordPress config, so new URLs are now produced in the right format (e.g. https://www.website.com/bordeaux.html) without any folder name.
My problem is redirecting all the OLD URLs to this new format, to prevent 404s and preserve the ranking.
I tried adding this line to my .htaccess:
RewriteRule ^/(.*)\.html$ /$1 [R=301,L,NC]
I also tried the RedirectMatch 301 (.*)\.html$ method, with the same result. I'm going crazy with this.
What am I doing wrong, and could you help me?
Thanks

RewriteRule ^/(.*)\.html$ /$1 [R=301,L,NC]
In .htaccess, the URL-path matched by the RewriteRule pattern never starts with a slash, so a pattern anchored with ^/ can never match. However, an unanchored slash prefix is useful here: it lets you match only the last path segment when there are preceding "folders". The target URL also needs to end in .html (as per your examples).
So, this can be written in a single directive:
RewriteRule /([^/]+\.html)$ /$1 [R=301,L]
This handles any level of nested "folders", but it does not match /foo.html (the target URL) in the document root, because of the slash prefix on the regex, so there is no redirect loop.
(No need for any preceding conditions.)
Here the $1 backreference includes the .html suffix.
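As a quick offline sanity check, the pattern can be tried against the example URLs from the question. This is only a sketch: it uses Python's re module as a stand-in for Apache's PCRE engine, and the helper name redirect_target is made up for illustration. The inputs imitate what .htaccess sees, i.e. the path without its leading slash.

```python
import re

# Same pattern as the RewriteRule above. Apache searches anywhere in the
# path unless the pattern is anchored, so re.search is the closest analogue.
PATTERN = re.compile(r"/([^/]+\.html)$")


def redirect_target(url_path):
    """Return the redirect target, or None if the rule would not match.

    url_path is the path as seen in .htaccess (no leading slash).
    Hypothetical helper for illustration, not part of Apache.
    """
    match = PATTERN.search(url_path)
    return "/" + match.group(1) if match else None


for path in ("Animals/Cats/mycat.html",
             "Countries/France/bordeaux.html",
             "bordeaux.html"):
    print(path, "->", redirect_target(path))
```

The last input stands for the new-style URL in the document root. It contains no slash, so the pattern cannot match it, which is why the redirect does not loop.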

Just match the last part of the url and pass it to the redirect:
RewriteRule /([^/]+)\.html$ /$1.html [R=301,L,NC]
It will match any number of directories like:
https://www.website.com/dir1/page1.html
https://www.website.com/dir1/dir2/page2.html
https://www.website.com/dir1/dir2/dir3/page3.html
https://www.website.com/dir1/dir2/dir3/dir4/page4.html

Related

How can I redirect users from a url to another using regex and htaccess?

I want to redirect users from www.example.com/ANYTHING to www.example.com.
Is .htaccess the better way to do it? How can I do it?
If you really just have to remove everything after www.mydomain.com, then you just have to delete everything from the first / to the end of the URL:
Search for this regex
%^([^/]*).*$
and substitute with \1. Note that I have used the % sign as the delimiter instead of /, so I don't need to escape the / in the regex. (I could have used any other available symbol other than /.)
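A minimal sketch of that search-and-replace, written in Python rather than in a tool that uses % delimiters, so only the bare regex is used. The function name strip_path is hypothetical, and the input is assumed to have no scheme prefix, as in the question.

```python
import re


def strip_path(url):
    """Keep everything before the first '/', drop the rest.

    Assumes a scheme-less URL such as 'www.example.com/ANYTHING';
    with 'https://...' the first '/' belongs to the scheme, so the
    result would be just 'https:'.
    """
    return re.sub(r"^([^/]*).*$", r"\1", url)


print(strip_path("www.example.com/ANYTHING"))
```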
You'll need to use a mod_rewrite RewriteRule (as opposed to a mod_alias RedirectMatch) in order to avoid conflicts with mod_dir and the DirectoryIndex (see note *1 below).
For example, in .htaccess (Apache 2.2 and 2.4):
RewriteEngine On
RewriteRule . / [R,L]
The single dot matches any non-empty URL-path (i.e. anything except the document root), and we redirect to the document root.
However, if the document root is an HTML webpage that links to resources like JavaScript, CSS and images then you need to make exceptions for these resources, otherwise these too will be redirected to the root!
For example:
RewriteEngine On
RewriteCond %{REQUEST_URI} !\.(js|css|jpg|png|gif)$ [NC]
RewriteRule . / [R,L]
*1 A mod_alias RedirectMatch directive such as RedirectMatch /. / ends up matching the rewritten request (by mod_dir) to the DirectoryIndex (eg. index.php) resulting in a redirect loop.
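The condition-plus-rule logic above can be modelled roughly as follows. This is an illustrative sketch using Python's re module, not how Apache evaluates it internally; the function name should_redirect and the sample paths are assumptions.

```python
import re

# Mirrors: RewriteCond %{REQUEST_URI} !\.(js|css|jpg|png|gif)$ [NC]
STATIC_ASSET = re.compile(r"\.(js|css|jpg|png|gif)$", re.IGNORECASE)


def should_redirect(request_uri):
    """Return True if the request would be redirected to '/'.

    request_uri starts with '/', like %{REQUEST_URI}. In a root
    .htaccess the RewriteRule pattern sees the path without that slash.
    """
    path = request_uri[1:]
    rule_matches = re.search(r".", path) is not None      # RewriteRule .
    cond_passes = STATIC_ASSET.search(request_uri) is None  # negated cond
    return rule_matches and cond_passes


for uri in ("/", "/anything", "/css/site.css", "/img/LOGO.PNG"):
    print(uri, "->", should_redirect(uri))
```

The document root itself is never redirected because the empty path gives the single dot nothing to match.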

htaccess regex to find image and image number

I have a URL like this:
/keyword1/keyword2/slugged-title-8286-1.jpg?wx=292&hx=164
In this case I would like to redirect to:
/images/8286-1.jpg?wx=292&hx=164
The listing number (here 8286) can be 4 or 5 digits and could perhaps contain letters. The parameters after the ? could also be different.
Could you please help me solve this?
I haven't done much with regex and am not sure how this can be done.
You can use this rule in your site root .htaccess:
RewriteEngine On
RewriteRule -(\w+(?:-\d+)?\.jpe?g)$ /images/$1 [L,NE,R=302]
If you don't want a full redirect then use:
RewriteRule -(\w+(?:-\d+)?\.jpe?g)$ /images/$1 [L]
The query string is automatically carried over to the target URL.
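To see which part of the URL the rule captures, here is a small sketch using Python's re module as an approximation of Apache's regex engine. The helper image_target is hypothetical, and the query string is appended by hand to imitate Apache's default behaviour.

```python
import re

# Same pattern as the RewriteRule above.
IMAGE_RULE = re.compile(r"-(\w+(?:-\d+)?\.jpe?g)$")


def image_target(url_path, query_string=""):
    """Return the rewritten target, or None if the rule does not match."""
    match = IMAGE_RULE.search(url_path)
    if match is None:
        return None
    target = "/images/" + match.group(1)
    # Apache re-appends the original query string when the target has no '?'.
    return target + "?" + query_string if query_string else target


print(image_target("keyword1/keyword2/slugged-title-8286-1.jpg",
                   "wx=292&hx=164"))
```

One thing worth checking against real URLs: the regex engine returns the leftmost match, so for a file name without the trailing -1 part, such as slugged-title-8286.jpg, this sketch captures title-8286.jpg rather than 8286.jpg.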

Apache rewrite remove .jpg.html from URL

I've been trying to write a rewrite rule for apache to switch my gallery2 URLs to the gallery3 URL format:
Old url example linked: http://domain.com/gallery/photoalbumxyz/photo.jpg.html
New url example needed: http://domain.com/photos/photoalbumxyz/photo
Note that in the URL example above, "/photoalbumxyz/photo.jpg.html" is not an actual physical directory, it is just the way gallery2 rewrote "friendly" URLs. I can rewrite the /gallery/ to /photos/ by using a rule like the following:
RewriteRule ^(.*)$ /photos/$1 [QSA,L,R=301]
However, I'm having trouble figuring out how to match and remove the ".jpg.html" extension, if it exists, in combination with the /gallery/ -> /photos/ rewrite. I believe the regex is going to be \.jpg\.html to escape the periods, but how do I write rules to remove the ".jpg.html" extension and rewrite the directory?
RewriteRule ^\.jpg\.html$ $1 [NC]
RewriteRule ^(.*)$ /photos/$1 [QSA,L,R=301]
Edit:
Sorry! I neglected to mention earlier that the album URL formats can vary (a URL doesn't have to specify a photo, and can include sub-albums). I've added some specific examples.
The URL rewrite rule also needs to handle:
old: http://example.com/gallery
new: http://example.com/photos
old: http://example.com/gallery/album
new: http://example.com/photos/album
old: http://example.com/gallery/album/subalbum/
new: http://example.com/photos/album/subalbum/
old: http://example.com/gallery/album/subalbum/photo.jpg.html
new: http://example.com/photos/album/subalbum/photo
Try:
RedirectMatch 301 ^/gallery(.*?)(\.(jpe?g|gif|png)\.html)?$ /photos$1
Or alternatively using mod_rewrite:
RewriteRule ^/?gallery/([^/]+)/([^.]+)\.(jpe?g|gif|png)\.html$ /photos/$1/$2 [L,R=301]
You don't need the QSA flag as query strings will automatically get appended if you don't have a ? in your rule's target.
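The RedirectMatch pattern can be checked against the four old/new pairs from the question with a short sketch. Python's re module stands in for Apache's regex engine here, and photos_target is a made-up helper name. RedirectMatch sees the full URL-path including the leading slash.

```python
import re

# Same pattern as the RedirectMatch above: a lazy capture for the album
# path, then an optional image-extension + ".html" suffix that is dropped.
GALLERY = re.compile(r"^/gallery(.*?)(\.(jpe?g|gif|png)\.html)?$")


def photos_target(url_path):
    """Return the redirect target, or None if the pattern does not match."""
    match = GALLERY.search(url_path)
    return "/photos" + match.group(1) if match else None


for old in ("/gallery",
            "/gallery/album",
            "/gallery/album/subalbum/",
            "/gallery/album/subalbum/photo.jpg.html"):
    print(old, "->", photos_target(old))
```

The lazy (.*?) matters: it lets the optional suffix group claim ".jpg.html" when it is present, so the suffix does not end up in $1.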

.htaccess Subdirectory Rewrite Exception

I'm currently consolidating posts on a site we recently acquired that had multiple WordPress installs to manage content, one in the public_html folder and another in a subdirectory, like so:
1. http://domain.com/
2. http://domain.com/another-install/
We're moving all of the content from /another-install/ into the main setup, and using a 301 redirect to remove /another-install/ from all old links like so:
RedirectMatch 301 ^/another-install/(.*) http://domain.com/$1
Resulting in all articles redirecting like so:
http://domain.com/another-install/article-name/
TO
http://domain.com/article-name/
The problem is, we want to keep /another-install/ viewable as a page. With the current redirect, http://domain.com/another-install/ goes to http://domain.com/. Is there any way to add an exception, or rewrite the current rule so that it keeps /another-install/ viewable?
Change your regex from (.*) (which matches 0 or more of any character) to (.+) (which matches 1 or more of any character). That means there would have to be something following /another-install/ in order for there to be a redirect.
You need a RewriteRule to specify exclusions. Add this to your .htaccess file
RewriteCond %{REQUEST_URI} !^/old-install/(index\.wml)?$ [NC]
RewriteRule ^old-install/(.+)$ http://domain.com/$1 [R=301,NC,L]
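The difference between (.*) and (.+) in the RedirectMatch suggestion above can be seen in a few lines. This is a sketch using Python's re module, with a hypothetical helper named target; it is not Apache itself.

```python
import re

ZERO_OR_MORE = re.compile(r"^/another-install/(.*)")  # original rule
ONE_OR_MORE = re.compile(r"^/another-install/(.+)")   # suggested change


def target(pattern, url_path):
    """Return the redirect target, or None if there is no redirect."""
    match = pattern.search(url_path)
    return "http://domain.com/" + match.group(1) if match else None


for path in ("/another-install/", "/another-install/article-name/"):
    print(path, "|", target(ZERO_OR_MORE, path), "|", target(ONE_OR_MORE, path))
```

With (.+) the bare /another-install/ URL no longer matches, so it stays viewable, while every URL below it is still redirected.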

Mod Rewrite Regex - Match a URL containing A but not B

I'm trying to send page requests through a front controller.
I need to match the following URLs:
domain.com/admin/
domain.com/admin/somepage
but I also have some assets in the /admin subfolder that I DON'T want to match:
domain.com/assets
domain.com/assets/mything.css
domain.com/xml/myxml.xml
I have written the following rule (inside .htaccess in the root of the site), and it matches all 5 example URLs. How do I get it to match the top two ONLY?
RewriteBase /
# index.php is the front controller
RewriteRule ^(admin) admin/index.php [L,QSA]
There are two ways I can see of doing this (I'd like to know how to do it using the folder 'assets' and 'xml' as an exclusion AND by setting extensions)
I'd use a negative RewriteCond before the final routing rule along these lines:
RewriteCond %{REQUEST_URI} !assets
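A rough model of how that negative condition combines with the routing rule from the question. This is a sketch in Python, not Apache's own evaluation. The function name is hypothetical, and the /admin/assets/... path is an assumption based on the statement that the assets live in the /admin subfolder.

```python
import re


def routes_to_front_controller(request_uri):
    """Return True if the request would be sent to admin/index.php.

    request_uri starts with '/', like %{REQUEST_URI}; the RewriteRule
    pattern in a root .htaccess sees the path without that slash.
    """
    path = request_uri[1:]
    # RewriteCond %{REQUEST_URI} !assets  (negated, unanchored match)
    cond_passes = re.search(r"assets", request_uri) is None
    # RewriteRule ^(admin) admin/index.php [L,QSA]
    rule_matches = re.search(r"^(admin)", path) is not None
    return cond_passes and rule_matches


for uri in ("/admin/", "/admin/somepage", "/admin/assets/mything.css"):
    print(uri, "->", routes_to_front_controller(uri))
```

Because the condition pattern is unanchored, any URL containing the string "assets" anywhere is excluded, including a page such as /admin/my-assets-page. Anchoring it (for example to the folder name) would be stricter.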