Is there a better way to do this regex? - regex

I finally figured out a good, easy way to make clean URLs with regex on my site in the format below. However, it will require a very large .htaccess file. I know from posts on here that mod_rewrite is not supposed to be too bad for performance, but I have never really seen it used the way I am using it, with a separate entry for almost every page of my site.
Below is an example: there are 2 entries for 1 page. The first entry rewrites
http://www.example.com/users/online/friends/
to
http://www.example.com/index.php?p=users.online.friends
It works great, but if the user is not on the first page, another segment is added to the URL for paging, and I had to write another entry to handle that case. Is this the correct way, or should these be combined somehow?
RewriteRule ^users/online/friends/*$ ./index.php?p=users.online.friends&s=8
RewriteRule ^users/online/friends/(\d+)/*$ ./index.php?p=users.online.friends&s=8&page=$1
The second one would do this
http://www.example.com/users/online/friends/22/
to
http://www.example.com/index.php?p=users.online.friends&page=22

It depends what you think is more readable, but here's how you could do it with a single rule:
RewriteRule ^users/online/friends(/(\d+))?/*$ ./index.php?p=users.online.friends&s=8&page=$2
(Edited to be more faithful to treatment of trailing slash in original question. Was: RewriteRule ^users/online/friends/((\d+)/*)?$ ./index.php?p=users.online.friends&s=8&page=$2)
Here I've just put "(...)?" around the final part of the url to make it an optional match, and changed the backreference to $2.
Of course, this actually rewrites http://www.domain.com/users/online/friends/ as:
http://www.domain.com/index.php?p=users.online.friends&s=8&page=
So your PHP code would have to check whether the page parameter is non-empty.

Yes, that's fine. I guess they could be combined into a single rule but there's not really any need.
You might consider leaving page as part of the URL so instead of:
http://www.domain.com/users/online/friends/22/
just have:
http://www.domain.com/users/online/friends?page=22
and then have one rule something like:
RewriteRule ^users/online/friends/?$ ./index.php?p=users.online.friends&s=8 [L,QSA]
to append the query string
Edit: There are a couple of ways of reducing the number of rewrite rules you have.
Firstly, use wildcards in the search terms, like:
RewriteRule ^users/(\w+)/(\w+)$ /index.php?p=users.$1.$2 [L,QSA]
will reduce quite a number of rules.
Secondly, if you're passing everything through /index.php just consider delegating all requests there:
RewriteRule ^(users/.*)$ /index.php/$1 [L,QSA]
That rule uses a third technique: instead of passing the path information via a query string parameter, pass it via the extra path info. That can be accessed via $_SERVER['PATH_INFO'].
That being said, lots of rules isn't necessarily bad. At least it's explicit about all your actions. The thing you have to watch out for, however, is creating a maintenance nightmare.
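As a rough sketch of the wildcard idea above, the two per-page entries from the question could collapse into two generic rules that also cover the optional page number. This assumes that section names consist only of lowercase letters (so that a trailing number is never mistaken for a section) and that the s=8 menu number is supplied some other way, since a wildcard rule cannot know it.

```apache
RewriteEngine On

# /users/<section>/<subsection>/<page>/ -> index.php?p=users.<section>.<subsection>&page=<page>
RewriteRule ^users/([a-z]+)/([a-z]+)/(\d+)/?$ index.php?p=users.$1.$2&page=$3 [L,QSA]

# /users/<section>/<subsection>/ -> index.php?p=users.<section>.<subsection>
RewriteRule ^users/([a-z]+)/([a-z]+)/?$ index.php?p=users.$1.$2 [L,QSA]
```

Under those assumptions, /users/online/friends/22/ would be handled by the first rule and /users/online/friends/ by the second, with no page parameter added at all.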

# Initial step
RewriteCond %{QUERY_STRING} !(?:^|&)p=
RewriteRule ^([^/]+)/(.+) /$2?p=$1 [QSA]
# Subsequent steps
RewriteCond %{QUERY_STRING} ((?:[^&]*&)*?)p=([^&]*)(.*)
RewriteRule ^([^/]+)/(.+) /$2?%1p=%2.$1%3
# Last step with page number
RewriteRule ^(\d+)/?$ /index.php?page=$1 [QSA,L]
# Last step without page number
RewriteCond %{QUERY_STRING} (?:((?:[^&]*&)*?)p=([^&]*))?(.*)
RewriteRule ^([^/]+)/?$ /index.php?%1p=%2.$1%3 [L]
This would rewrite the URL in several steps:
http://www.domain.com/users/online/friends/22/
http://www.domain.com/online/friends/22/?p=users
http://www.domain.com/friends/22/?p=users.online
http://www.domain.com/22/?p=users.online.friends
http://www.domain.com/index.php?p=users.online.friends&page=22
An easier method would be the following, but would require you to change your scripts:
RewriteRule ^(.*?)(?:/(\d+))?/?$ /index.php?p=$1&page=$2 [QSA,L]
It would do everything in one step, with a little difference:
http://www.domain.com/users/online/friends/22/
http://www.domain.com/index.php?p=users/online/friends&page=22
Adding the s=8 query argument would require more work:
1. Creating a text file with the menu numbers for each page.
2. Adding a RewriteMap directive.
3. Changing the second-last rule to use the same RewriteCond as the last rule has.
4. Adding &s=${menumap:%2|0} to the last two rules.
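The RewriteMap steps above might look roughly like the following. This is only a sketch: the map name menumap comes from the answer, but the file path, the example map entries and the simplified wildcard rule are assumptions made for illustration. Note that a map lookup is written ${mapname:key|default}, and that RewriteMap itself can only be declared in the server or virtual-host configuration, not in .htaccess.

```apache
# Server or virtual-host config only (RewriteMap is not allowed in .htaccess).
# The path is a placeholder.
RewriteMap menumap "txt:/path/to/menumap.txt"

# Assumed contents of menumap.txt, one "key value" pair per line:
#   users.online.friends 8
#   users.online.all     7

RewriteEngine On

# Look up the menu number for the page key; fall back to 0 if the key is missing.
# The optional leading slash lets the pattern work in both server and .htaccess context.
RewriteRule ^/?users/(\w+)/(\w+)/?$ /index.php?p=users.$1.$2&s=${menumap:users.$1.$2|0} [L,QSA]
```

With the example map entries, a request for /users/online/friends/ would be rewritten with s=8, while a page that has no entry in the file would get s=0.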

Related

Rewrite rule with pagination .htaccess

I have a URL like this:
http://example.com/category/title which comes from the link http://example.com/cview.php?url=title
I want to create pagination and to be like http://example.com/category/title/page/1 or
http://example.com/category/title/1
this comes from http://example.com/cview.php?url=title&pageno=1.
I have tried this in .htaccess without success
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^category/([^/]*)$/([^/]+)/?$ /cview.php?url=$2&pageno=$1 [L]
Can anyone help please?
RewriteRule ^category/([^/]*)$/([^/]+)/?$ /cview.php?url=$2&pageno=$1 [L]
You have an erroneous $ (end-of-string anchor) in the middle of the RewriteRule pattern. You also appear to have the backreferences $1 and $2 the wrong way round. You are also allowing an optional trailing slash, yet your example URLs do not use this. (An optional trailing slash potentially creates a duplicate content issue.)
If you allow both /category/title/page/1 and /category/title/1 then you are potentially creating a duplicate content issue. Presumably you are only linking to one of these URL formats?
Since the page number is a "number" then it makes sense to just match numbers, rather than anything - this also helps to avoid conflicts with other directives.
It doesn't look like you need the conditions (RewriteCond directives) that check the request does not map to a file or directory, since I wouldn't expect a request of the form /category/title/page/1 to map to a file or directory anyway?
Try the following instead (without the RewriteCond directives):
RewriteRule ^category/([^/]+)(?:/page)?/(\d+)$ /cview.php?url=$1&pageno=$2 [L]
This matches both /category/title/page/<num> and /category/title/<num>. The optional subpattern (?:/page) is non-capturing, so that it doesn't mess up the numbering of the backreferences.
Bear in mind also that the order of the rules in .htaccess is important in order to avoid conflicts.
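To illustrate the point about ordering, here is a minimal sketch that places the paginated rule alongside a rule for the unpaginated URL. The second rule is an assumption (the question does not show the existing rule for /category/title).

```apache
RewriteEngine On

# Paginated: /category/<title>/page/<num> or /category/<title>/<num>
RewriteRule ^category/([^/]+)(?:/page)?/(\d+)$ /cview.php?url=$1&pageno=$2 [L]

# Unpaginated: /category/<title> (assumed form of the existing rule)
RewriteRule ^category/([^/]+)$ /cview.php?url=$1 [L]
```

These two patterns cannot overlap, because [^/]+ never matches across a slash. A broader catch-all such as ^category/(.+)$, however, would also match the paginated URLs, so it would need to come after the more specific rule.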

URL rewrite to exclude certain HTML files

I need to rewrite uppercase URLs to lowercase.
I got that part working.
However, I am having a problem: I want to exclude URLs of the form *.html?vs=12312312 from the above.
I tried the following:
RewriteCond %{REQUEST_URI} !^.*\.(html\?vs=) [NC]
But it didn't work.
http://foo.com/com.foo.bar/content/Any.html?vs=12312312
The above should stay the way it is, and not be rewritten.
What's wrong with the rule above? What should be the proper syntax?
Update
I tried the following:
RewriteCond %{REQUEST_URI} !(.*\.html\?vs=.*)$ [NC]
But still no luck.
The query string is not part of the REQUEST_URI, it is stored in QUERY_STRING. So try something like this, which goes before your existing rule:
RewriteCond %{REQUEST_URI} \.html$
RewriteCond %{QUERY_STRING} ^vs=[^&=]+$
RewriteRule ^ - [L]
The reason you need to do it this way (as its own separate rule), rather than putting an exclusion on your existing rule, is because you can't do AND with negative matches of RewriteCond, so putting them on your existing rule as negative matches would prevent it from running if only one applied (.html or ?vs=nnn). To reject when both apply, you need to do it in a separate, positive match like this.
If you have other rules you need to apply to those URLs after this, look at the [S=1] flag, which skips the next rule on a match, instead of [L], which stops processing after a match (and hence doesn't apply your subsequent rules to these URLs).
The rule RewriteRule ^ - just says don't change anything, it's used to only apply the effect of the flags.
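A sketch of the [S=1] variant could look like this. The lowercase rule is a hypothetical stand-in, since the question does not show the existing one; it assumes a map named lc has been declared in the server configuration.

```apache
# Assumed to exist in the server or virtual-host config
# (RewriteMap cannot be declared in .htaccess):
#   RewriteMap lc int:tolower

RewriteEngine On

# For requests like /path/Any.html?vs=12312312, skip exactly one following rule
RewriteCond %{REQUEST_URI} \.html$
RewriteCond %{QUERY_STRING} ^vs=[^&=]+$
RewriteRule ^ - [S=1]

# Hypothetical stand-in for the existing uppercase-to-lowercase rule
RewriteCond %{REQUEST_URI} [A-Z]
RewriteRule ^(.*)$ /${lc:$1} [R=301,L]

# Any rules placed below this point are still evaluated for the skipped URLs
```

The skip count refers to whole rules, so the RewriteCond attached to the lowercase rule is skipped along with it.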

My apache rewrite only works for the first folder level

We have a website where we show clients creative work we have produced for them. We upload raw assets to a path like this:
x.com/clients/clientName/campaignName/size/
I have a PHP script which adds our branding, contact information and other information and pulls in the raw creative (usually a swf object). It is in this directory x.com/clients/index.php and it accepts a query string parameter ?path so it knows where to look for the creative.
I am trying to do an apache rewrite in .htaccess so that our designers can upload directly to the known folder structure but so that when you go to x.com/clients/clientName/campaignName/size/ it should rewrite to x.com/clients/index.php?path=clientName/campaignName/size/
I am currently using the following rewrite rule, which works for the first folder level e.g. x.com/clients/clientName/ does successfully rewrite, but any subsequent folders do not.
RewriteRule ^clients/([^/\.]+)/?$ /clients/index.php?path=$1 [L]
My RegEx's are terrible, so I'm stuck on what to do. Any help appreciated, thank you kindly.
Your regex is only matching urls like clients/xxxxxx/ because your pattern [^/\.]+ means one or many characters except "/" or "."
With your rule, it can't work for other subdirectories.
You can replace your rule with this one:
RewriteRule ^clients/(.+)$ /clients/index.php?path=$1 [L]
To avoid an internal server error (code 500, which means an infinite loop in this case), you can do it this way:
RewriteRule ^clients/index\.php$ - [L]
RewriteRule ^clients/(.+)$ /clients/index.php?path=$1 [L]
Is there a special reason you want to use regex? In my opinion you can just catch everything coming after /clients:
RewriteEngine on
RewriteCond %{REQUEST_URI} !^(.*/)?index\.php$ [NC]
RewriteRule ^clients/(.*)$ /clients/index.php?path=$1 [L]
The second line prevents redirect loops: index.php is also in the /clients folder, so without it the rule would keep rewriting its own target.
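One thing the rules above do not address is that the raw assets live under the same /clients/ path, so a request for an uploaded file would also be rewritten to index.php. If the creative should still be reachable at its own URL (an assumption about the setup, which the question does not spell out), a file-existence check could be added along these lines:

```apache
RewriteEngine On

# Leave the wrapper script itself alone to avoid a rewrite loop
RewriteRule ^clients/index\.php$ - [L]

# Serve real files (for example the uploaded .swf) directly
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^clients/(.+)$ /clients/index.php?path=$1 [L]
```

Directory URLs such as /clients/clientName/campaignName/size/ are not regular files, so they would still be passed to index.php.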

RewriteRule for image file requests only if a query string is present

With my limited regular expression and mod_rewrite abilities, I'm attempting to rewrite certain image requests so I can alter the output with a php script. Here's what I have:
RewriteRule ^(public|uploads)/([A-Za-z0-9/_-]+).(JPEG|JPG|GIF|PNG|jpeg|jpg|gif|png)$ public/images.php?%{QUERY_STRING}&src=$1/$2.$3 [L]
# Capture groups: $1 = public|uploads, $2 = path and file name, $3 = extension
This does work, but it's too greedy and doesn't require the query string, which is important - otherwise all images requests would be rewritten. I tried putting a ? or ?(.*) in the rule, and I would either get an internal server error or it didn't seem to solve the problem (most likely because I didn't do it correctly). I also tried %{QUERY_STRING} at the end of the condition, but that didn't seem to affect anything.
Here's what I want to happen:
Any requests for public/ or uploads/...
Followed by any path to an image (file extension case insensitive)...
Followed by a query string...
...should rewrite to public/images.php with the original query string, and add one additional parameter: src, which contains the actual path to the image (the rewritten part).
Extra "would-be-nice", but not necessary: Restrict the rule to only rewrite the url if the query string contains at least one item from a set of parameters. For example, only if one of the width=, height= or contrast= params are present. If this makes things bloated or complicated, I'm not worried about it.
So for example, a request for:
uploads/images/my_folder/test.jpg?width=320&height=220
Should be served by:
public/images.php?width=320&height=220&src=uploads/images/my_folder/test.jpg
The .htaccess file is in my root directory, as well as the public and uploads directories.
I want to avoid absolute URLs, because I want this to be portable without needing to edit. I've done a good deal of googling and reading related SO posts, and still can't figure this one out. How can I patch this rule to do what I want, or is there a better way to write this altogether?
Edit: Just want to note that this rule worked for me previously:
RewriteRule ^(public|uploads)/([A-Za-z0-9/_-]+).(JPEG|JPG|GIF|PNG|jpeg|jpg|gif|png)/([0-9]+)$ public/images.php?width=$4&src=$1/$2.$3
...but only for requests like uploads/my_folder/image.jpg/280 - I used the 280 as the width, but now I want to accept combinations of multiple parameters in no particular order:
Two approaches:
1. Add a condition to only rewrite when query string is not empty (can be anything):
RewriteCond %{QUERY_STRING} !^$
RewriteRule ^(public|uploads)/([A-Za-z0-9/_-]+)\.(jpe?g|gif|png)$ public/images.php?src=$1/$2.$3 [NC,QSA,L]
2. Add conditions to only rewrite when the query string is not empty and has at least one of those parameters present (but the value can be empty):
RewriteCond %{QUERY_STRING} (^|&)width=([^&]*)(&|$) [OR]
RewriteCond %{QUERY_STRING} (^|&)height=([^&]*)(&|$) [OR]
RewriteCond %{QUERY_STRING} (^|&)contrast=([^&]*)(&|$)
RewriteRule ^(public|uploads)/([A-Za-z0-9/_-]+)\.(jpe?g|gif|png)$ public/images.php?src=$1/$2.$3 [NC,QSA,L]
I have not really tested #2 .. but should work fine (copied from fully working code).
NOTES:
You can replace ^(public|uploads)/([A-Za-z0-9/_-]+)\.(jpe?g|gif|png)$ by ^((?:public|uploads)/(?:[A-Za-z0-9/_-]+)\.(?:jpe?g|gif|png))$ .. and then instead of src=$1/$2.$3 use just src=$1
Alternatively -- replace ^(public|uploads)/([A-Za-z0-9/_-]+)\.(jpe?g|gif|png)$ by ^(?:public|uploads)/[A-Za-z0-9/_-]+\.(?:jpe?g|gif|png)$ and then use src=%{REQUEST_URI} -- the only difference is that the src= parameter will start with a leading slash.
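For reference, approach #2 combined with the single-capture note could be condensed into something like the sketch below. Merging the three [OR] conditions into one alternation is a variation on the answer, not part of it, and it is untested here.

```apache
RewriteEngine On

# Rewrite only if the query string contains width=, height= or contrast=
RewriteCond %{QUERY_STRING} (^|&)(width|height|contrast)=
# Capture the whole image path once and pass it as src; QSA keeps the original parameters
RewriteRule ^((?:public|uploads)/[A-Za-z0-9/_-]+\.(?:jpe?g|gif|png))$ public/images.php?src=$1 [NC,QSA,L]
```

With this, uploads/images/my_folder/test.jpg?width=320&height=220 would be passed to public/images.php with src=uploads/images/my_folder/test.jpg followed by the original parameters, while the same image requested without any of those parameters would be served as a normal file.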

How can I improve my .htaccess mod_rewrite stuff?

I've created the following .htaccess file after hours of work.
Everything seems to be working properly; however, I'm new to mod_rewrite and I think my code is amateurish, so I'm looking for things to improve.
For example, I thought that if I used [L] at the end of a rule, the rest of the rewrites would be ignored, but looking at the rewrite logs I see that they are not: there are multiple unwanted pattern matches that will certainly slow everything down.
Also I have a book that says [C] will chain rewrite conditions, but my apache throws
http://pastebin.com/62JyBXdS
The [L] flag does indeed prevent further rules from processing, however the rewritten url could be passed back through all of your rules a second time hence the multiple entries in your log - see the manual page http://httpd.apache.org/docs/2.2/rewrite/flags.html#flag_l
A lot of your rewrite rules do the same thing with just different data and could be compacted down to a single regex. I've done a few, but you could do the entire list.
RewriteRule ^/([dprcmlfb]|members|lnli|freelisting)/(.*)$ /$1\.php/$2 [L]
You could also add a RewriteCond, something like
RewriteCond %{REQUEST_URI} !^/[^/]+\.php
to prevent the rule firing for a PHP file request.
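Put together, the condition has to come directly before the rule it guards. A minimal sketch follows; the optional leading slash in the pattern is an addition so that it also matches in .htaccess context, where the leading slash is stripped before matching.

```apache
RewriteEngine On

# Do not touch requests that already point at a top-level .php script
RewriteCond %{REQUEST_URI} !^/[^/]+\.php
# /d/stuff -> /d.php/stuff, /members/stuff -> /members.php/stuff, and so on
RewriteRule ^/?([dprcmlfb]|members|lnli|freelisting)/(.*)$ /$1.php/$2 [L]
```

This also addresses the repeated log entries: when the rewritten URL is passed through the rule set again, the condition fails and the rule is not applied a second time.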
You could add the MultiViews option instead of rules like the rule below:
RewriteRule ^/d/(.*)$ /d\.php/$1 [L]
MultiViews would correctly interpret /d/stuff as a request to d.php if no other rule interferes.
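Enabling MultiViews is a one-line change, sketched below. It assumes mod_negotiation is loaded and, if set from .htaccess, that AllowOverride permits the Options directive. It is worth checking against your own setup before removing the explicit rules.

```apache
# Let Apache resolve /d/stuff to d.php (with /stuff as path info) when no file or directory named "d" exists
Options +MultiViews
```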