Apache rewrite remove .jpg.html from URL - regex

I've been trying to write a rewrite rule for apache to switch my gallery2 URLs to the gallery3 URL format:
Old url example linked: http://domain.com/gallery/photoalbumxyz/photo.jpg.html
New url example needed: http://domain.com/photos/photoalbumxyz/photo
Note that in the URL example above, "/photoalbumxyz/photo.jpg.html" is not an actual physical directory, it is just the way gallery2 rewrote "friendly" URLs. I can rewrite the /gallery/ to /photos/ by using a rule like the following:
RewriteRule ^(.*)$ /photos/$1 [QSA,L,R=301]
However, I'm having trouble figuring out how to match and remove the ".jpg.html" extension, if it exists, in combination with the /gallery/ -> /photos/ rewrite. I believe the regex will be \.jpg\.html, with the periods escaped, but how do I write rules that both remove the ".jpg.html" extension and rewrite the directory?
RewriteRule ^\.jpg\.html$ $1 [NC]
RewriteRule ^(.*)$ /photos/$1 [QSA,L,R=301]
Edit:
Sorry! I neglected to mention earlier that the album URL format can vary (a URL doesn't have to specify a photo, and it can include sub-albums). I've added some specific examples.
The rewrite rule also needs to handle the following:
old: http://example.com/gallery
new: http://example.com/photos
old: http://example.com/gallery/album
new: http://example.com/photos/album
old: http://example.com/gallery/album/subalbum/
new: http://example.com/photos/album/subalbum/
old: http://example.com/gallery/album/subalbum/photo.jpg.html
new: http://example.com/photos/album/subalbum/photo

Try:
RedirectMatch 301 ^/gallery(.*?)(\.(jpe?g|gif|png)\.html)?$ /photos$1
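As a quick sanity check, the RedirectMatch pattern above can be exercised outside Apache against the four example URLs from the question. This is only a sketch: it uses Python's re module as a stand-in for Apache's regex engine (assuming the two agree on a pattern this simple), and the helper name redirect_gallery is made up for illustration.

```python
import re

# Same pattern as the RedirectMatch directive; Apache's $1 is group(1) here.
GALLERY = re.compile(r"^/gallery(.*?)(\.(jpe?g|gif|png)\.html)?$")


def redirect_gallery(path):
    """Return the redirect target for a URL-path, or None if it does not match."""
    match = GALLERY.match(path)
    if match is None:
        return None
    # The lazy (.*?) leaves an optional trailing ".jpg.html" to the second group,
    # so only the part before the extension ends up in the target.
    return "/photos" + match.group(1)


if __name__ == "__main__":
    for old in ("/gallery",
                "/gallery/album",
                "/gallery/album/subalbum/",
                "/gallery/album/subalbum/photo.jpg.html"):
        print(old, "->", redirect_gallery(old))
```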

Or alternatively using mod_rewrite:
RewriteRule ^/?gallery/([^/]+)/([^.]+)\.(jpe?g|gif|png)\.html$ /photos/$1/$2 [L,R=301]
You don't need the QSA flag as query strings will automatically get appended if you don't have a ? in your rule's target.
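To see which of the question's URLs the mod_rewrite pattern covers, it can be tried against a few paths in the same way. The sketch below again assumes Python's re behaves like Apache's engine for this pattern; rewrite_photo is a hypothetical helper, not part of any Apache API.

```python
import re

# Pattern copied from the RewriteRule; the optional leading slash lets it
# work both in .htaccess (no leading slash) and in server config.
PHOTO = re.compile(r"^/?gallery/([^/]+)/([^.]+)\.(jpe?g|gif|png)\.html$")


def rewrite_photo(path):
    """Return the /photos/... target, or None when the rule would not fire."""
    match = PHOTO.match(path)
    if match is None:
        return None
    # $1 is the first album segment, $2 is everything up to the first dot
    # (which may itself contain slashes, i.e. sub-albums).
    return "/photos/{}/{}".format(match.group(1), match.group(2))


if __name__ == "__main__":
    for old in ("gallery/photoalbumxyz/photo.jpg.html",
                "gallery/album/subalbum/photo.jpg.html",
                "gallery/album"):
        print(old, "->", rewrite_photo(old))
```

In this sketch the plain album URL is not matched, which suggests the rule handles photo pages only and album URLs would need the RedirectMatch variant or a separate rule.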

Related

HTACCESS : Redirect (301) thousands of Url's containing directories to simple url's

I need to use .htaccess to redirect a huge number of URLs already produced (and already indexed...) by WordPress for articles, which contain folders/subfolders, to simple URLs without any folder/subfolder name.
Example:
FROM https://www.website.com/Animals/Cats/mycat.html TO https://www.website.com/mycat.html
FROM https://www.website.com/Animals/Dogs/mydog.html TO https://www.website.com/mydog.html
FROM https://www.website.com/Countries/France/bordeaux.html TO https://www.website.com/bordeaux.html
etc...
I have already changed the permalink options in the WordPress config, so the URLs now produced are in the right format (e.g. https://www.website.com/bordeaux.html), without any folder name.
My problem is redirecting all the OLD URLs to this new format to prevent 404s and preserve the ranking.
I tried adding this line to my .htaccess:
RewriteRule ^/(.*)\.html$ /$1 [R=301,L,NC]
I also tried the RedirectMatch 301 (.*)\.html$ method, and the result is the same. I'm going crazy with this.
What am I doing wrong, and could you help me?
Thanks
RewriteRule ^/(.*)\.html$ /$1 [R=301,L,NC]
In .htaccess, the URL-path matched by the RewriteRule pattern never starts with a slash. But you can use this to match only the last path segment when there are more "folders". And the target URL also needs to end in .html (as per your examples).
So, this can be written in a single directive:
RewriteRule /([^/]+\.html)$ /$1 [R=301,L]
This handles any level of nested "folders". But does not match /foo.html (the target URL) in the document root (due to the slash prefix on the regex), so no redirect-loop.
(No need for any preceding conditions.)
Here the $1 backreference includes the .html suffix.
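The no-loop argument above can be checked with a small sketch. It assumes the rule lives in .htaccess in the document root, so the path Apache hands to the pattern has its leading slash stripped; Python's re stands in for Apache's engine, and flatten is a made-up helper name.

```python
import re

# Unanchored at the start, like the RewriteRule: it only needs to find
# "/<last-segment>.html" at the end of the path.
LAST_SEGMENT = re.compile(r"/([^/]+\.html)$")


def flatten(path):
    """Return the redirect target, or None when the rule would not match."""
    match = LAST_SEGMENT.search(path)
    if match is None:
        return None
    return "/" + match.group(1)


if __name__ == "__main__":
    # Paths as seen in .htaccess, i.e. without the leading slash.
    for old in ("Animals/Cats/mycat.html",
                "Countries/France/bordeaux.html",
                "mycat.html"):
        print(old, "->", flatten(old))
```

The last path has no slash left in it, so the pattern does not match and the already-flattened URL is not redirected again.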
Just match the last part of the url and pass it to the redirect:
RewriteRule /([^/]+)\.html$ /$1.html [R=301,L,NC]
It will match any number of directories like:
https://www.website.com/dir1/page1.html
https://www.website.com/dir1/dir2/page2.html
https://www.website.com/dir1/dir2/dir3/page3.html
https://www.website.com/dir1/dir2/dir3/dir3/page4.html

mod rewrite: Rewrite rule match slashes and change them to GET variables

I'm using index.php as my entry script. I want to add specific rewrite rules to my .htaccess so that my URLs are in a better format.
So far I have only managed to match my controller & action, by using
RewriteRule ^(.*)$ /mysite/index.php?r=$1 [L]
So I can use any of these URLs
domain.com/users/user => /mysite/index.php?r=users/user
domain.com/contacts/contact => /mysite/index.php?r=contacts/contact
I want to add some additional GET variables to my URLs, to get specific records like
domain.com/users/user/id/10
or
domain.com/contacts/contact/id/2/name/5
Now I can do this only this way:
domain.com/users/user&id=10 => /mysite/index.php?r=users/user&id=10
domain.com/contacts/contact&id=2&name=5 => /mysite/index.php?r=contacts/contact&id=2&name=5
How can I change my Rewrite rule to support GET variables as well?
Add QSA flag:
RewriteRule ^(.*)$ /mysite/index.php?r=$1 [L,QSA]
QSA stands for Query String Append, which makes sure the existing query string is preserved while new query parameters are added.
Reference: Apache mod_rewrite Introduction
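The effect of the QSA flag can be illustrated with a deliberately simplified model of what happens to the query string. This is a sketch, not Apache's actual implementation: the function name rewrite and the example request domain.com/users/user?id=10 are assumptions made for illustration.

```python
def rewrite(path, query, qsa):
    """Model 'RewriteRule ^(.*)$ /mysite/index.php?r=$1' with or without QSA.

    path  -- the URL-path matched by ^(.*)$ (no leading slash, as in .htaccess)
    query -- the original request's query string, without the '?'
    qsa   -- whether the QSA flag is set on the rule
    """
    target = "/mysite/index.php?r=" + path
    # The substitution already contains a query string, so the original one
    # is dropped unless QSA asks for it to be appended.
    if qsa and query:
        target += "&" + query
    return target


if __name__ == "__main__":
    print(rewrite("users/user", "id=10", qsa=False))
    print(rewrite("users/user", "id=10", qsa=True))
```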

Seo friendly url for specific page

I need to make the URL of a specific page user-friendly.
I have a page www.example.com/index.php?route=a/b
which I want to be rewritten as -> www.example.com/a
I used this rule in .htaccess but it's not working:
RewriteRule ^/a$ index.php?route=a/b
Please help
Change your rule to:
RewriteEngine On
RewriteRule ^a/?$ /index.php?route=a/b [L,NC,QSA]
And remember that inside .htaccess the leading slash of the URI is not matched, so use ^a/?$ instead of ^/a/?$.
Reference: Apache mod_rewrite Introduction
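The leading-slash point is easy to verify with the two patterns side by side. The sketch below assumes the rule sits in the document root's .htaccess, so a request for www.example.com/a is matched against the path "a"; Python's re stands in for Apache's engine, and matches is a made-up helper.

```python
import re


def matches(pattern, path):
    """Return True if the pattern matches the per-directory path.

    re.IGNORECASE plays the role of the NC flag.
    """
    return re.search(pattern, path, re.IGNORECASE) is not None


if __name__ == "__main__":
    path = "a"  # what mod_rewrite sees in .htaccess for a request to /a
    print("^/a$   :", matches(r"^/a$", path))
    print("^a/?$  :", matches(r"^a/?$", path))
```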

.htaccess Subdirectory Rewrite Exception

I'm currently consolidating posts on a site we recently acquired that had multiple WordPress installs to manage content, one in the public_html folder and another in a subdirectory, like so:
1. http://domain.com/
2. http://domain.com/another-install/
We're moving all of the content from /another-install/ into the main setup, and using a 301 redirect to remove /another-install/ from all old links like so:
RedirectMatch 301 ^/another-install/(.*) http://domain.com/$1
Resulting in all articles redirecting like so:
http://domain.com/another-install/article-name/
TO
http://domain.com/article-name/
The problem is, we want to keep /another-install/ viewable as a page. With the current redirect, http://domain.com/another-install/ goes to http://domain.com/. Is there any way to add an exception, or rewrite the current rule so that it keeps /another-install/ viewable?
Change your regex from (.*) (which matches 0 or more of any character) to (.+) (which matches 1 or more of any character). That means there would have to be something following /another-install/ in order for there to be a redirect.
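The difference between the two quantifiers shows up directly on the bare directory URL. A minimal sketch, assuming Python's re matches Apache's behaviour for these patterns; the names STAR, PLUS and redirect are made up for illustration.

```python
import re

STAR = re.compile(r"^/another-install/(.*)")  # original rule: 0 or more characters
PLUS = re.compile(r"^/another-install/(.+)")  # suggested rule: 1 or more characters


def redirect(pattern, path):
    """Return the redirect target, or None if the pattern does not match."""
    match = pattern.match(path)
    if match is None:
        return None
    return "http://domain.com/" + match.group(1)


if __name__ == "__main__":
    for path in ("/another-install/", "/another-install/article-name/"):
        print(path, "| (.*):", redirect(STAR, path), "| (.+):", redirect(PLUS, path))
```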
You need a RewriteRule to specify exclusions. Add this to your .htaccess file
RewriteCond %{REQUEST_URI} !^/another-install/(index\.wml)?$ [NC]
RewriteRule ^another-install/(.+)$ http://domain.com/$1 [R=301,NC,L]

RewriteCond %{QUERY_STRING} in mod_rewrite (dynamic to static URL) not working

As I have had very little luck getting query strings to redirect correctly when just passing query string parameters, and the over-arching advice across this site and webmasterworld for query string redirection seems to be "deal with it as a RewriteCond querystring", I'm trying to use the following type of rule for a set of about 10 URLs.
Example URL:
http://www.example.org/training_book.asp?sInstance=1&EventID=139
What I have so far:
RewriteCond %{QUERY_STRING} ^training_book.asp\?sInstance=1&EventID=139
RewriteRule /clean-landing-url/ [NC,R=301,L]
So, what I want to happen is
http://www.site.org/training_book.asp?sInstance=1&EventID=139 301> http://www.site.org/clean-landing-url
but instead what is happening is this:
http://www.site.org/training_book.asp?sInstance=1&EventID=139 301> http://www.site.org/training_book.asp/?sInstance=1&EventID=139
It's appending a forward slash just before the querystring, and then resolving the full URL (obviously, 404ing.)
What am I missing? Is it a regex issue with the actual %{QUERY_STRING} parameter?
Thanks in advance!
EDIT -
Here's where I am so far.
Based on the advice from @TerryE below, I've tried implementing the following rule.
I have a set of URLs with the following parameters:
http://www.example.org/training_book.asp?sInstance=1&EventID=139
http://www.example.org/training_book.asp?sInstance=2&EventID=256
http://www.example.org/training_book.asp?sInstance=5&EventID=188
etc.
which need to redirect to
http://www.example.org/en/clean-landing-url-one
http://www.example.org/en/clean-landing-url-two
http://www.example.org/en/clean-landing-url-three
etc.
This is the exact structure of the htaccess file I have currently, including the full examples of the "simple" redirects which are presently working fine (note - http://example.com > http://www.example.com redirects enforced in httpd.conf)
#301 match top level pages
RewriteCond %{HTTP_HOST} ^example\.org [NC]
RewriteRule ^/faq.asp /en/faqs/ [NC,R=301,L]
All URLs in this block are of this type. All these URLs work perfectly.
#Redirect all old dead PDF links to English homepage.
RewriteRule ^/AR08-09.pdf /en/ [NC,R=301,L]
All URLs in this block are of this type. All these URLs work perfectly.
The problem is here: I still can't get URLs of the type below to redirect. Based on advice from @TerryE, I attempted to change the syntax as shown. The block below does not function correctly.
#301 event course pages
RewriteCond %{QUERY_STRING} sInstance=1&EventID=139$
RewriteRule ^training_book\.asp$ /en/clean-landing-url-one? [NC,R=301,L]
The output of this is
http://staging.example.org/training_book.asp/?sInstance=1&EventID=139
(this is currently applying to staging.example.org, will apply to example.org)
(I had "hidden" some of the actual syntax by changing it to event_book from training_book in the initial question, but I've changed it back to be as real as possible.)
Read the documentation: QUERY_STRING contains only the part of the request after the ?, so your condition regexp will never match. This makes more sense:
RewriteCond %{QUERY_STRING} ^sInstance=1&EventID=139$
RewriteRule ^event_book\.asp$ /clean-landing-url/ [NC,R=301,L]
The forward slash is caused by a different Apache feature (mod_dir's DirectorySlash directive).
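Why the original condition cannot match becomes clear once the URL is split the way Apache splits it. The sketch below uses Python's urllib.parse.urlsplit to separate path and query, and re as a stand-in for Apache's regex engine; query_matches and the constant URL are names introduced here for illustration.

```python
import re
from urllib.parse import urlsplit

URL = "http://www.example.org/training_book.asp?sInstance=1&EventID=139"


def query_matches(pattern, url):
    """Test a RewriteCond-style pattern against the query string only."""
    query = urlsplit(url).query  # everything after the '?', without the '?'
    return re.search(pattern, query) is not None


if __name__ == "__main__":
    print("query string seen by the condition:", urlsplit(URL).query)
    # The question's pattern includes the script name and '?', which are not
    # part of QUERY_STRING.
    print(query_matches(r"^training_book.asp\?sInstance=1&EventID=139", URL))
    # The answer's pattern matches the query string alone.
    print(query_matches(r"^sInstance=1&EventID=139$", URL))
```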