SEO Friendly URLs Through -> mod_rewrite - regex

I am attempting to make my URLs SEO-friendly.
Sample URL (that I would like):
http://sampleurl.com/FILENAME/VARIABLE/
What I would like to transpose this to is:
http://sampleurl.com/FILENAME.php?x=VARIABLE
My .htaccess:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)/(.*)/$ $1.php?l=$2 [L]
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)/$ $1.php [L]
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule . index.php [L]
# Proper regex -> ([a-zA-Z0-9_-]+)
</IfModule>
My challenges are:
1) The trailing slashes: with my regex, the trailing slash is mandatory. When I change the regex to ^(.*)/(.*)/?$, a ".php" gets appended to the end of the variable. If I leave the ? off, the trailing slash is required; without it, the request resolves to the "index.php" result.
2) Same in the second RewriteRule: with the ? added I get an Internal Server Error; without the ?, all good.
So everything works now, except I must have that trailing slash.
Seems like I am missing something pretty easy here - and would appreciate someone pointing it out.
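One way to make the trailing slash optional, sketched under the assumption that this .htaccess sits in the document root and that every FILENAME has a matching FILENAME.php next to it. The greedy (.*) is what swallows the slash: for FILENAME/VARIABLE/ the first group captures FILENAME/VARIABLE and the second is empty, hence the ".php" on the end of the variable. The sketch uses [^/]+ instead. The extra -f check on the target is an addition that is not in the rules above; it is meant to stop a rule from matching its own output (foo.php -> foo.php.php -> ...), which is a likely cause of the Internal Server Error.

```apache
<IfModule mod_rewrite.c>
RewriteEngine On

# FILENAME/VARIABLE or FILENAME/VARIABLE/ -> FILENAME.php?l=VARIABLE
# [^/]+ cannot cross a slash, so the optional /? stays outside both groups.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
# Assumption: only rewrite when the target script really exists.
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^([^/]+)/([^/]+)/?$ $1.php?l=$2 [L]

# FILENAME or FILENAME/ -> FILENAME.php
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^([^/]+)/?$ $1.php [L]

# Anything else that is not a real file or directory -> index.php
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule . index.php [L]
</IfModule>
```

If the .htaccess lives in a subdirectory, the %{DOCUMENT_ROOT}/$1.php path would need that subdirectory added.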

Related

.htaccess - Recursively mapping slashes to underscores

G'day,
As the title says, I'm trying to map a URL formatted as /this/is/mah/page/structure to the file this_is_mah_page_structure.php.
Now, I have that working - except that I can't know the depth of the structure. Thus I need to have some recursion going on.
The working snippet I have for one and two replacements is:
# map the slashes to underscores
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)/([^/]+)/?$ $1_$2 [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)/([^/]+)/([^/]+)/?$ $1_$2_$3 [L]
I have found this: Htaccess recursively replace underscores with hyphens which states how to do it (for different characters); and this https://www.askapache.com/htaccess/rewrite-underscores-hyphens-seo-url/ which shows similar code. But I can't make it work.
Also, I found that .htaccess recursion can cause 500 errors - somewhere in the order of 10 loops - based on LimitInternalRecursion. From that, I tried replacing multiple with the one pass, but that had the unexpected outcome of doubling up the url.
This is the example from the other StackOverflow answer to give you an idea of what I've been testing. Any ideas / thoughts / direction?
(Rewrites underscores with hyphens - so it's the wrong side, but it's a start)
RewriteEngine On
# if there is only one underscore then replace it by - and redirect
RewriteRule ^([^_]*)_([^_]*)$ /$1-$2 [L,R=302]
# if there is more than one underscore then "repeatedly" replace it by - and set env var USCORE
RewriteRule ^([^_]*)_([^_]*)_(.*) $1-$2-$3 [E=USCORE:1]
# if USCORE env var is set then redirect
RewriteCond %{ENV:REDIRECT_USCORE} =1
RewriteRule ^([^_]+)$ /$1 [L,R=302]
-- Edit and full solution --
So it finally dawned on me through this post that the presence / absence of R=302 (or 301) is what shifts the idea from "Map URL to file" (i.e. transparent) to "Redirect". It should have been obvious, but I thought there was more to it.
The final solution is as #anubhava suggested, but I removed the ,R=301 and added RewriteCond %{REQUEST_FILENAME} !-f and RewriteCond %{REQUEST_FILENAME} !-d to prevent collision between the rules and real files/directories of the same name. (i.e. scripts.js, /images/yomomma.jpg, etc)
# when there are more than one / then "recursively" replace it by _
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)/([^/]+/.+)$ $1_$2 [N,DPI]
# when there is only one / then replace it by _ (internal rewrite, no redirect)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)/([^/]+)$ /$1_$2 [NE,L]
I posted that answer about 5 years ago, though I now think it can be improved further.
You may use these rules in site root .htaccess:
RewriteEngine On
# if request is for a file [OR]
RewriteCond %{REQUEST_FILENAME} -f [OR]
# if request is for a directory
RewriteCond %{REQUEST_FILENAME} -d
# then ignore all rules below
RewriteRule ^ - [L]
# when there is only one / then replace it by _ and redirect
RewriteRule ^([^/]+)/([^/]+)$ /$1_$2 [NE,L,R=301]
# when there are more than one / then recursively replace / by _
RewriteRule ^([^/]+)/(.+)$ $1_$2 [N,DPI]
Change R=301 rule to RewriteRule ^([^/]+)/([^/]+)$ /$1_$2 [L] if you don't want an external redirect.
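Put together, the internal (non-redirecting) variant described above could look like the following sketch. The trace in the comments uses the hypothetical request /this/is/mah/page and is a reading of how the [N] restarts should proceed, not captured log output. Appending .php is left out here, as it is in the rules above.

```apache
RewriteEngine On

# Real files and directories are served untouched.
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^ - [L]

# Exactly one slash left: replace it and stop (no R flag, so no external redirect).
RewriteRule ^([^/]+)/([^/]+)$ /$1_$2 [L]

# More than one slash: replace the first one, then restart the rule set ([N]).
RewriteRule ^([^/]+)/(.+)$ $1_$2 [N,DPI]

# Expected passes for this/is/mah/page:
#   this/is/mah/page -> this_is/mah/page   (last rule, restart)
#   this_is/mah/page -> this_is_mah/page   (last rule, restart)
#   this_is_mah/page -> /this_is_mah_page  (single-slash rule, done)
```

Each restart removes one slash, so the number of passes grows with the depth of the path.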

How to do this complicated thing with mod_rewrite?

I'm hopeless when it comes to regex and/or the RewriteEngine, so my hours of researching and trying things have been pretty fruitless so far.
I'm trying to use the RewriteEngine to accomplish behavior that will follow these rules:
If the requested URL...
...points to an existing file e.g. domain.com/existing_file.ext
do no rewrites
...is empty, or contains only trailing slash(es) e.g. domain.com/
rewrite to index.php?var=example
...points to an existing directory that is not root (with or without trailing slashes) e.g. domain.com/existing_directory
rewrite to index.php?var=REQUESTED_DIRECTORY_PATH/example where REQUESTED_DIRECTORY_PATH is everything after domain.com (preferably always without a trailing slash)
...is not empty, but doesn't point to an existing file or directory e.g. domain.com/no_such_file_or_directory
rewrite to index.php?var=REQUESTED_URL, where REQUESTED_URL is everything after domain.com
This is what I've got so far:
# /
RewriteRule ^$ index.php?var=example [QSA,L]
# /directory_name/
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^(.+[^/])$ /index.php?var=$0/example [QSA,L]
# /not_a_valid_file_or_dir/
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-l
RewriteRule ^(.+)$ index.php?var=$0 [QSA,L]
Which to me seems to almost do what I want, except for when I try to access domain.com/existing_directory (with or without a trailing slash). In this case I get redirected to domain.com/existing_directory/ (with a slash), while I would like to end up at domain.com/index.php?var=existing_directory/example.
Thanks to helpful comments from Bananaapple, and a bit of Googling, I managed to accomplish what I wanted.
Firstly, I had to turn DirectorySlash off, and secondly I needed to remove the [^/] from the regex. So the final relevant code would look something like this:
DirectorySlash Off
# /
RewriteRule ^$ index.php?var=example [QSA,L]
# /directory_name/
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^(.+)$ /index.php?var=$0/example [QSA,L]
# /not_a_valid_file_or_dir/
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-l
RewriteRule ^(.+)$ index.php?var=$0 [QSA,L]
Thanks for the help.

Using htaccess to force a trailing slash before the ? with a query string?

I have the following in my htaccess file:
RewriteEngine On
RewriteBase /
# Check to see if the URL points to a valid file
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Trailing slash check
RewriteCond %{REQUEST_URI} !(.*)/$
# Add slash if missing & redirect
RewriteRule ^(.*)$ $1/ [L,R=301]
# Check to see if the URL points to a valid file
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Send to index.php for clean URLs
RewriteRule ^(.*)$ index.php?/$1 [L]
This does work. It hides index.php, and it adds a trailing slash... except when there is a query string.
This URL:
http://example.com/some-page
gets redirected to:
http://example.com/some-page/
but this URL:
http://example.com/some-page?some-var=foo&some-other-var=bar
does not get redirected. I would like for the above to be sent to:
http://example.com/some-page/?some-var=foo&some-other-var=bar
I've reached the limits of my understanding of redirects with this. If you have a working answer, I would really appreciate a walkthrough of what every line is doing and why it works. Double bonus awesomeness for an explanation of why what I have right now doesn't work when there is a query string involved.
Try adding [QSA] to the flags of the last RewriteRule to preserve the original query string, as below:
# Send to index.php for clean URLs, preserve original query string
RewriteRule ^(.*)$ index.php?/$1 [L,QSA]
a walkthrough of what every line is doing and why it works.
See my comments below
#turn mod_rewrite engine on.
RewriteEngine On
#set the base for urls here to /
RewriteBase /
### if this is not a request for an existing file or directory
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
### and the URI does not end with a /
RewriteCond %{REQUEST_URI} !(.*)/$
### redirect and add the slash.
RewriteRule ^(.*)$ $1/ [L,R=301]
### if this is not a request for an existing file or directory
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# rewrite to index.php passing the URI as a path, QSA will preserve the existing query string
RewriteRule ^(.*)$ index.php?/$1 [L,QSA]
I believe that if you change this:
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ $1/ [L,R=301]
to this:
RewriteCond %{REQUEST_URI} !^([^?]*)/($|\?)
RewriteRule ^([^?]*) $1/ [L,R=301]
then it should do what you want.
The changes I made are:
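In both rewrite-condition and -rule, I changed (.*) and ^(.*) to ^([^?]*), to ensure that, if there's a query-string, then it is not included in either regex. (mod_rewrite already matches %{REQUEST_URI} and the RewriteRule pattern against the path only, with the query string kept in %{QUERY_STRING}, so this is a safeguard rather than a requirement.) ([^…] means "any character that is not in …", so [^?] means "any character that is not a question mark".)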
In the rewrite-condition, I changed $ to ($|\?), so as to match either end-of-URL or end-of-part-before-the-query-string.
In the rewrite-rule, I dropped the $, since it was no longer needed.

.htaccess with trailing slash redirects to main page

What I have is http://site.com/index.php?i=1
and want to work with:
[1] http://site.com/about and
[2] http://site.com/about/ (notice slash at the end)
I use following rules:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)/?$ index.php?i=$1 [QSA,L]
And this works only for [1] but if I put / at the end it redirects to http://site.com
I tried different options, but I either get a "not found", a server error, or only one of the two URLs works. Any idea?
If you want redirection for URI /about/ to /index.php?i=1 then your rule should be:
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^about/?$ index.php?i=1 [QSA,L,NC]
Not sure why you are matching everything by using (.*) there.
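If the goal is instead to pass whatever name was requested through to index.php, a sketch along these lines may help. It assumes i should receive the name without its trailing slash, which the question does not state outright. With the greedy ^(.*)/?$ the slash ends up inside the capture (about/ yields i=about/); [^/]+ keeps it out.

```apache
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
# about and about/ both become index.php?i=about
RewriteRule ^([^/]+)/?$ index.php?i=$1 [QSA,L]
```

This only covers single-segment paths; anything containing a further slash would not match and would need its own rule.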

rewrite rule for WordPress 3.3 permalinks is not working

Ever since an upgrade to WordPress 3.3, URLs are not redirecting as they should.
Changed: domain.com/2010/10/postname/ to: domain.com/postname/
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^/[0-9]{4}/[0-9]{2}/(.+)$ /$1 [NC,R=301,L]
The problem was due to the leading slash and not using $3
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([0-9]{4})/([0-9]{1,2})/(.+)$ /$3 [NC,R=301,L]
There's a script here you can use to generate .htaccess rules if you want to change permalinks to the /%postname%/ structure.
http://yoast.com/change-wordpress-permalink-structure/
My permalinks were exactly the same as yours, I used this tool to change them and it is working well.
The last rule will never get applied if the previous rule matches. Assuming that the http://domain.com/2010/10/postname/ request doesn't match a file or directory, the RewriteRule . /index.php [L] is going to rewrite the URI to /index.php thus it'll never get to your rule. Try moving your rule up to the top, just below RewriteBase /, and duplicate the !-f/!-d conditions, so that it looks like this:
RewriteBase /
# for 301 redirect
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^/[0-9]{4}/[0-9]{2}/(.+)$ /$1 [NC,R=301,L]
# the rest of the rules
RewriteRule ^atom.xml$ feed/ [NC,R=301,L]
RewriteRule ^rss.xml$ feed/ [NC,R=301,L]
RewriteRule ^rss2.xml$ feed/ [NC,R=301,L]
RewriteCond %{HTTP_USER_AGENT} !FeedBurner [NC]
RewriteCond %{HTTP_USER_AGENT} !FeedValidator [NC]
RewriteRule ^feed/?([_0-9a-z-]+)?/?$ http://feeds.feedburner.com/handle [R=302,NC,L]
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
Also, if this is in an .htaccess file, you need to remove the leading slash in the rule match so that it looks like this: ^[0-9]{4}/[0-9]{2}/(.+)$
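For reference, the redirect with that leading slash removed would read as below. This is only the one rule, assuming it sits in the site root .htaccess directly under RewriteBase /. In that per-directory context mod_rewrite strips the leading slash before matching, so a pattern starting with ^/ can never match.

```apache
# 301 from /2010/10/postname/ to /postname/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^[0-9]{4}/[0-9]{2}/(.+)$ /$1 [NC,R=301,L]
```

The same pattern with the leading slash kept would be the right form in the main server or virtual host configuration, where the full URL path is matched.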