Apache mod_rewrite complex URL regex

I want to make a clean URL from a URL like this:
http://example1.com/cimage/webroot/img.php?src=http://example.com/img1.jpg&w=600&h=800&q=60&sharpen&crop-to-fit
So the resulting URL should look something like this:
http://example1.com/cimage/webroot/img.php/http://example2.com/img1.jpg/600*800/60/sharpen/crop-to-fit
My problem is creating a regex for use in Apache mod_rewrite.
Please help. Thanks.

Based on the discussion in the comments, this is the solution:
RewriteRule ^img.php/(https?):/+(.+?(?:\.jpg|\.png))/(\d+)/(\d+)/(\d+)/?([^\/]*)/?([^\/]*)/?([^\/]*)$ img.php?src=$1://$2&w=$3&h=$4&q=$5&$6&$7&$8
The logic here is to match img.php/ at the start of the string, then either http or https, :, one or more /s, the host and file name, then the various parameters for size, quality, etc.
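To check the pattern outside Apache, here is a minimal Python sketch of the same regex and substitution. The function name rewrite and the sample paths are assumptions made for illustration, and Python's re module is only a stand-in for the PCRE engine that mod_rewrite uses, so treat it as a way to inspect the capture groups rather than a guarantee of Apache's behaviour.

```python
import re

# Same pattern as the RewriteRule above (dot in "img.php" escaped here).
PATTERN = re.compile(
    r"^img\.php/(https?):/+(.+?(?:\.jpg|\.png))/(\d+)/(\d+)/(\d+)"
    r"/?([^/]*)/?([^/]*)/?([^/]*)$"
)


def rewrite(path):
    """Return the rewritten target for `path`, or None if the rule does not match."""
    m = PATTERN.match(path)
    if m is None:
        return None
    scheme, rest, w, h, q, *flags = m.groups()
    # Mirrors: img.php?src=$1://$2&w=$3&h=$4&q=$5&$6&$7&$8
    return f"img.php?src={scheme}://{rest}&w={w}&h={h}&q={q}&" + "&".join(flags)


print(rewrite("img.php/http:/example2.com/img1.jpg/600/800/60/sharpen/crop-to-fit"))
```

Note that unused optional groups still contribute an "&" to the query string, because the substitution always emits &$6&$7&$8.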
Another way to handle this without the complicated regex is to do a simple catch-all rule like this:
RewriteRule ^img.php\/(.+) img.php?url=$1
Then you can do the parsing in PHP, mostly using simpler operations like explode(). This approach makes sense especially if there might be additional operations/parameters; otherwise, your regex starts to have a lot of capturing groups, making it hard to read and maintain.
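As a rough illustration of that application-side parsing, here is a sketch in Python rather than PHP (str.split plays the role of explode()). The helper name parse_image_path, the returned dictionary keys, and the assumption that the first three segments after the image URL are width, height and quality are all hypothetical choices for this example.

```python
import re


def parse_image_path(url_param):
    """Split the catch-all value into the source URL and its parameters."""
    # The source URL itself contains slashes, so cut it off at the image extension first.
    m = re.match(r"(https?):/+(.+?\.(?:jpg|png))(?:/(.*))?$", url_param)
    if m is None:
        return None
    scheme, rest, tail = m.group(1), m.group(2), m.group(3) or ""
    # Plain split on "/" for the remaining segments, dropping empty ones.
    parts = [p for p in tail.split("/") if p]
    if len(parts) < 3 or not all(p.isdigit() for p in parts[:3]):
        return None
    width, height, quality = (int(p) for p in parts[:3])
    return {
        "src": f"{scheme}://{rest}",  # normalises "http:/" back to "http://"
        "w": width,
        "h": height,
        "q": quality,
        "flags": parts[3:],  # any number of extra options, e.g. sharpen
    }
```

Adding another option then means appending a path segment, with no change to the rewrite rule.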

Related

Help convert Apache rewrite rules to PHP regular expressions

Short story: I am using this technique to auto-version my css and js files by adding a string to the filename with filemtime():
http://w-shadow.com/blog/2012/07/30/automatic-versioning-of-css-js/
I got it up and running perfectly on my local machine (MAMP), but I use WP Engine for my hosting and they are set up on nginx and don't support .htaccess rewrite rules.
They do have a place to enter PHP regular expressions (preg_replace), though, and their instructions look like this:
HTML Post-Processing
A mapping of PHP regular expressions to replacement values which are executed on all blog HTML after WordPress finishes emitting the entire page. The pattern and replacement behavior is in the manner of preg_replace().
The following example removes all HTML comments in the first pattern, and causes a favicon (with any filename extension) to be loaded from another domain in the second pattern:
#<!--.*?-->#s =>
#\bsrc="/(favicon\..*)"# => src="http://mycdn.somewhere.com/$1"
So I'm wondering how hard it is to convert this rewrite rule to a PHP regular expression:
RewriteRule ^(.*)\.[\d]{10}\.(css|js)$ $1.$2 [L]
And would this even do the same thing as the Apache rewrite? The whole point of the technique is to bust the browser cache for CSS or JS files any time they are changed, but without resorting to query strings, which have various drawbacks.
Actually, it's pretty much the same. Take your regex, delimit it, drop it in a string and escape the right things, then take your rewrite rule and use single quotes to make it a string, and you're done. In your example:
$newUrl = preg_replace('/^(.*)\\.[\\d]{10}\\.(css|js)$/', '$1.$2', $url);
This will properly rewrite any URL you give it. However, it sounds like these preg_replaces are being done across a large document, which means your regex there won't do what you think it will. That, however, is a completely separate question, and one I won't even guess at, because I don't know what your requirements are. If you need help crafting the regex, please open another question with your specific requirements.
Also: next time, check the documentation.
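To make the difference between a single URL and a whole document concrete, here is a small Python sketch using the same pattern (re.sub stands in for preg_replace). The function name strip_version and the sample strings are assumptions for illustration only.

```python
import re

# Same pattern as the RewriteRule: name, dot, 10 digits, dot, css|js.
VERSIONED = re.compile(r"^(.*)\.[\d]{10}\.(css|js)$")


def strip_version(text):
    """Apply the anchored replacement to `text`."""
    return VERSIONED.sub(r"\1.\2", text)


# A single URL: the anchors line up with the start and end of the string.
print(strip_version("/css/style.1343647210.css"))

# A line of HTML: "^" and "$" no longer coincide with the URL, so nothing changes.
print(strip_version('<link href="/css/style.1343647210.css">'))
```

The second call returns its input unchanged, which is the behaviour the warning about large documents refers to.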

Apache mod_proxy_html Substitute: how to re-use part of regex match? (regex variables?)

[Full disclosure: Cross-post between here and ServerFault, because I believe the audiences (server admins & devs) are distinct enough to warrant asking the question to both separately.]
Hi all,
Have a unique URL-rewriting situation in Apache.
I need to be able to take a URL that starts with
"\u002f[X]"
or
'\u002f[X]"
Where X is the rest of some URL, and substitute the text
"\u002fmeis2\u002f[X]
I'm not sure how the Regex works in Apache -- I think it's the same as Perl 5? -- but even then I'm a little unsure how this would be done. My hunch is that it has to do with Regex grouping and then using $1 to pull the variable out, but I'm entirely unfamiliar with this process in Apache.
Hoping someone can help -- thanks!
You are right. Group the text that you want to re-use with parens, and use $1 in the substitution. Use the following .htaccess file:
RewriteEngine On
RewriteRule ^\u002f(.*) /\u002fmeis2\u002f$1
(I am not certain that mod_rewrite handles unicode escapes, but it seems so from your question.)
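The capture-and-reuse idea can be tried out in isolation with a short Python sketch. It assumes the text contains the literal six characters \u002f after an opening quote, which is why the backslash is escaped in the pattern. The function name add_prefix is hypothetical, and Python's m.group(1) plays the role that $1 plays in Apache.

```python
import re

# Group 1 captures the opening quote; "\\u002f" matches a literal backslash + "u002f".
QUOTED_SLASH = re.compile(r"""(["'])\\u002f""")


def add_prefix(text):
    """Insert the meis2 segment after every quoted, escaped leading slash."""
    # Re-emit the captured quote, then the new prefix; the rest of the URL is untouched.
    return QUOTED_SLASH.sub(lambda m: m.group(1) + "\\u002fmeis2\\u002f", text)


print(add_prefix('var a = "\\u002fapp\\u002fpage";'))
```

Only an escaped slash that directly follows a quote is rewritten, so later \u002f sequences inside the same URL are left alone.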

Can one use named backreferences in Apache mod_rewrite

All,
I've come across an interesting little quirk in one of my RewriteRules, which I wanted to resolve by the use of named back references. However from what I can see, this is not possible in Apache's mod_rewrite.
I have two incoming urls, each containing a key variable, which need to be rewritten to the same underlying framework action.
Incoming urls:
/users/list/page-2
/users/list/2
Desired rewrite endpoint
/?module=users&action=list&pagenum=2
I would have liked to do something like this
RewriteRule ^/(?P<module>([\w]+))/(?P<action>([\w]+))/(page-)?(?P<pagenum>([\d]+))$ /?module=${module}&action=${action}&pagenum=${pagenum} [L,QSA]
However, Apache just doesn't want to play like that at all, and gives me null values in place of the named backreferences. To get around the problem I've used numerical references to the captured groups ($1, $2, $4), but I'm almost halfway to Apache's N=9 limit. So this isn't a show stopper for me.
I would just like to know, if named backreferences are available in Apache's mod_rewrite, and if they are, why does my RewriteRule's pattern not match?
Thanks,
Ian
This might be useful:
https://httpd.apache.org/docs/trunk/rewrite/rewritemap.html
If @superspace's latest answer doesn't work, what I would suggest is routing all requests that are not for actual files or directories to an index page. Then set up a routing class which takes in the page name and does manual matching, so you can have your named-capture regex array and list the templates or pages you want to feed.
If you have to go this way, let me know and I can offer some code from my classes.
No named backreferences, it seems, after looking into the mod_rewrite source.
I'd recommend using the RewriteMap option anyway instead of a long list of RewriteRules, as it will be much faster than iterating through a lengthy list.
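For the numbered-group workaround mentioned in the question, one way to keep the group count down is to make the optional "page-" prefix non-capturing, so the page number becomes the third group instead of the fourth. The sketch below checks that variant in Python. The function name rewrite is hypothetical, and the pattern is an illustrative variant, not the asker's exact rule.

```python
import re

# (?:page-)? is non-capturing, so the groups are 1=module, 2=action, 3=pagenum.
RULE = re.compile(r"^/(\w+)/(\w+)/(?:page-)?(\d+)$")


def rewrite(path):
    """Return the framework URL for `path`, or None when the rule does not match."""
    m = RULE.match(path)
    if m is None:
        return None
    module, action, pagenum = m.groups()
    return f"/?module={module}&action={action}&pagenum={pagenum}"


print(rewrite("/users/list/page-2"))
print(rewrite("/users/list/2"))
```

Both incoming forms map to the same endpoint, and only three numbered groups are used.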

Regex to match a URL pattern for a htaccess file

I'm really struggling with this regex.
We're launching a new version of our "groups" feature, but need to support all the old groups at the same URL. Luckily, these two features have different url patterns, so this should be easy? Right?
Old: domain.com/group/$groupname
New: domain.com/group/$groupid/$groupname
The regex I was using to redirect old groups to the old script was: RewriteRule ^(group/([^0-9/].*))$ /old_group.php/$1 [L,QSA]
It was working great until a group popped up called something like: domain.com/group/2009-neato-group
At which point the regex matched and forced the new group to the old code because it contained numbers. Ooops.
So, any thoughts on a regex that would match the old pattern so I can redirect it?
Patterns it must match:
group/my-old-group
group/another-old-group/
group/123-another-old-group-456/a-module-inside-the-group/
Patterns it cannot match:
group/12/a-new-group
group/345/another-new-group/
group/6789/a-3rd-group/module-abc/
The best way I've thought about it is that it cannot start with "group-slash-numbers-slash", but haven't been able to turn that into a regex. So, something to negate that pattern would probably work.
Thanks
Think the other way around: If the URL matches ^group/[0-9]+/.+, then it is a new group, so redirect these first (with a [L] rule). Everything else must be an old group, and you can keep your old rules there.
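The "match the new pattern first" idea can be checked against the sample paths from the question with a few lines of Python. The function name classify is an assumption for this sketch. The pattern is the one given in the answer.

```python
import re

# New-style URLs: "group/", a purely numeric id, a slash, then at least one more character.
NEW_GROUP = re.compile(r"^group/[0-9]+/.+")


def classify(path):
    """Return "new" for new-style group URLs and "old" for everything else."""
    return "new" if NEW_GROUP.match(path) else "old"


for path in [
    "group/my-old-group",
    "group/another-old-group/",
    "group/123-another-old-group-456/a-module-inside-the-group/",
    "group/12/a-new-group",
    "group/345/another-new-group/",
    "group/6789/a-3rd-group/module-abc/",
]:
    print(path, "->", classify(path))
```

The old-style path that starts with digits is still classified as old, because the digits are followed by "-" rather than "/".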
Your design choices make a pure-htaccess solution difficult, especially since group/123-another-old-group-456/a-module-inside-the-group/ is a valid old-style URL.
I'd suggest a redirector script that looks at its arguments and determines whether the first part is a valid groupid. If so, it's a new-style URL, so take the user to that group. Otherwise it is an old-style URL, so hand it off to old_group.php. Something like this:
RewriteRule ^group/(.*)$ /redirector.php/$1 [L,QSA]

RegEx match replace help

I am trying to do a regex match and replace for an .htaccess file but I can't quite figure out the replace bit. I think I have the match part right, but maybe someone can help.
I have this url-
http://www.foo.com/MountainCommunities/Lifestyles/5/VacationHomeRentals.aspx
And I'm trying to turn it into this-
http://www.foo.com/mountain-lifestyle/Vacation-Home-Rentals.aspx
(MountainCommunities/Lifestyles)/\d/(.*)(.aspx)
and then I figured I would have a rewrite rule starting like this-
mountain-lifestyle/$2$3
but I need to take what is in $2 in this instance and rewrite it to place dashes between the words with capital letters. Now I'm stumped.
I think you'll have to do it in two bits... Take out $2, precede every capital (apart from the first) with a -, then just append the result to http://www.foo.com/mountain-lifestyle/ with a .aspx on the end.
Try this:
RewriteRule ^(([A-Z][a-z]+-)*)([A-Z][a-z]+)(([A-Z][a-z]+)+)(\.aspx)?$ /$1$3-$4 [N]
RewriteRule ^([A-Z][a-z]+-)+[A-Z][a-z]+$ /$0.aspx [R=301]
Note that mod_rewrite uses an internal counter to detect and avoid infinite loops. So your URL may not contain too many words that have to be converted (see MaxRedirects option for Apache < 2.1 and LimitInternalRecursion directive for Apache ≥ 2.1).
I don't think what you're doing with the capital letters is possible with regex...
You would be better keeping the dashes in the URL and removing the .aspx
eg: http://www.foo.com/MountainCommunities/Lifestyles/5/Vacation-Home-Rentals
This would require the following rule:
^/MountainCommunities/Lifestyles/5/([^/]+)/\?([^/]+) /mountain-lifestyle/$1.aspx?$2 [I]
This also takes into account any querystrings that are sent to the page.
BTW: How are you using .htaccess with IIS?
You can use the regular expression "([A-Z])" on the middle bit "VacationHomeRentals", replacing with "-$1". This will give you "-Vacation-Home-Rentals". Then you can just chop off the first character, and stick the first part of the URL on the front, and .aspx on the end.
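That replace-then-trim approach looks like this as a small Python sketch. The function names dasherize and new_url are hypothetical, and the base path is taken from the example URL in the question.

```python
import re


def dasherize(name):
    """Put a dash before every capital letter, then drop the leading dash."""
    return re.sub(r"([A-Z])", r"-\1", name)[1:]


def new_url(page_name):
    # Stick the fixed first part of the URL on the front and .aspx on the end.
    return "http://www.foo.com/mountain-lifestyle/" + dasherize(page_name) + ".aspx"


print(new_url("VacationHomeRentals"))
```

This assumes the name starts with a capital letter; otherwise the slice would remove a real character instead of the leading dash.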
I think the main regex has been written by others, but here is one to match the request name and place dashes (assuming all the file names have a three-word camel-cased form like 'VacationHomeRentals.aspx'):
RewriteRule: ^/MountainCommunities/Lifestyles/\d+/([A-Z][a-z]+)([A-Z][a-z]+)([A-Z][a-z]+)\.aspx$ /mountain-lifestyle/$1-$2-$3.aspx
This is a restricted version of @Gumbo's response, as I have not had a chance to test his recursion. The recursion technique is definitely the best and most usable for any scenario.
I don't think I quite understand what you are trying to do. Why can't you simply search for:
http://www.foo.com/MountainCommunities/Lifestyles/5/VacationHomeRentals.aspx
and replace it with:
http://www.foo.com/mountain-lifestyle/Vacation-Home-Rentals.aspx ?
Or is this a specific example of a pattern you are trying to transform?