.htaccess rewrite - regex

I don't know if this is the right area, but here goes:
I have a RewriteRule
RewriteRule ^(eScience/)?(\w+)/RENDER/(\d+)/(\d+)/P(\d+)\.html$ /RENDER/escience/kids/1016/2063/test.html [L,NC]
that works fine because I've hardcoded the IDs in. Now when I do something like
RewriteRule ^(eScience/)?(\w+)/RENDER/(\d+)/(\d+)/P(\d+)\.html$ /RENDER/escience/kids/$2/2063/test.html [L,NC]
the rewrite doesn't work; I get a page-not-found error. The really odd part is that $4 works, so if I do something like
RewriteRule ^(eScience/)?(\w+)/RENDER/(\d+)/(\d+)/P(\d+)\.html$ /RENDER/escience/kids/1016/$4/test.html [L,NC]
it works, but $3 and anything below it doesn't. Any ideas? The URL that I am using is
http://www.escience.ca/kids/RENDER/1016/2063/P2063.html
As you can see, $3 and $4 are the exact same IDs, so that's why my third example works.

Look at your regex groups:
RewriteRule ^(eScience/)?(\w+)/RENDER/(\d+)/(\d+)/P(\d+)\.html$ /RENDER/escience/kids/$2/2063/test.html [L,NC]
Groups in order: $1 = (eScience/)?, $2 = (\w+), $3 = (\d+), $4 = (\d+), $5 = (\d+)
It should be obvious why it doesn't work: $2 is not the number you expected. It is the (\w+) group, which matches "kids" in your URL; the first number is $3 and the second is $4. Named groups can help if you lose track of the numbering in complex regular expressions in general, but mod_rewrite substitutions only take the numbered $N form. You can also stop a group from capturing by starting it with ?:, for example "(?:ungrouped)(dollar1)(dollar2)".
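A minimal sketch of the non-capturing approach, assuming the URL from the question and that only the numeric path segments are needed. The hardcoded kids segment and the test.html target are carried over from the question as-is and are not verified against the real site:

```apache
# (?:...) groups without capturing, so the numeric segments become $1, $2, $3.
# For kids/RENDER/1016/2063/P2063.html: $1 = 1016, $2 = 2063, $3 = 2063.
RewriteRule ^(?:eScience/)?(?:\w+)/RENDER/(\d+)/(\d+)/P(\d+)\.html$ /RENDER/escience/kids/$1/$2/test.html [L,NC]
```

With this numbering the substitution produces the same target as the hardcoded rule that already works.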

Related

How do I find the 20th regex match group?

I have a RewriteRule inside the .htaccess file in one of my htdocs folders.
The rewriterule looks something like this:
RewriteRule ^index/(blah)/(blah2)/(blah3)..../(blah20)
(The rule above may look like bad practice; don't worry about that.)
Anyway, I had heard that ${20} is the correct way to access the 20th match group in a regex. On regex101 my 20th group does match blah20, but whenever I print the 20th capture group I just get the literal text ${20}.
Why is this? Am I accessing two-digit match groups correctly?
Edit: the real RewriteRule:
RewriteRule ^a/([\d]*)/(b/([\d]{2}:[\d]{2}:[\d]{2})/?)?(c/(\w*)/?)?(d/([\w]{6})/?)?(e/([\w]{6})/?)?(f/([\w]{6})/?)?(g/([\w]{6})/?)?(h/([\w]{6})/?)?(i/([\w]{6})/?)?(j/([\w]{6})/?)?(k/([\w]{6})/?)?(l/([\w]{6})/?)?(m/([\w]{6})/?)? /index.php?a=$1&b=$3&c=$5&d=$7&e=$9&f=${11}&g=${13}&h=${15}&i=${17}&j=${19}&k=${21}&l=${23}&m=${25} [L]
You cannot use a back-reference number greater than 9, per the official mod_rewrite documentation.
From the manual:
RewriteRule back-references: These are back-references of the form $N (0 <= N <= 9). $1 to $9 provide access to the grouped parts (in parentheses) of the pattern, from the RewriteRule which is subject to the current set of RewriteCond conditions. $0 provides access to the whole string matched by that pattern.
If you are dealing with that many back-references, it is better to pass the full URI after index/ to index.php and split it with explode() in the PHP code:
RewriteRule ^index/(.+)$ index.php?q=$1 [L,QSA,NC]
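Alternatively, use non-capturing groups (?:...) to skip the segments you don't need. For example:
RewriteRule ^index(?:/\w+){5}/(blah6)
This skips the first five folders after index without capturing them, so the 6th folder in the URL becomes $1.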
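If everything has to stay inside mod_rewrite, one possible workaround is to split the captures between the RewriteRule pattern ($1 to $9) and a RewriteCond (%1 to %9). The following is only a sketch under assumptions: the .htaccess file sits in the document root, every segment matches \w+, and the parameter names a to f are hypothetical placeholders:

```apache
RewriteEngine On
# The condition captures segments 4-6 as %1..%3; the first three are skipped
# with a non-capturing group because the rule below already captures them.
RewriteCond %{REQUEST_URI} ^/index/(?:\w+/){3}(\w+)/(\w+)/(\w+)/?$
# The rule captures segments 1-3 as $1..$3.
RewriteRule ^index/(\w+)/(\w+)/(\w+)/ /index.php?a=$1&b=$2&c=$3&d=%1&e=%2&f=%3 [L]
```

This gives at most 18 captures in total, so for longer URLs passing the whole path to the script is still the simpler option.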

Rewrite URL regex: different filenames to same URL structure

e.g. from
A: www.example.com/food.php
B: www.example.com/food2.php?type=bacon
C: www.example.com/food2.php?type=tomato
It's easy to get:
A: www.example.com/food
B: www.example.com/food2/bacon
C: www.example.com/food2/tomato
But how about getting the following from my original URL structure?
B: www.example.com/food/bacon
C: www.example.com/food/tomato
Do I need to keep the same filename to get the desired URLs or is there some regex to do it without causing problems?
I don't have much experience with regex, just simple rewrites from templates.
Use this:
RewriteEngine On
RewriteRule ^food/([^/]*)$ /food2.php?type=$1 [L]
This will leave you with the URLs:
www.example.com/food/bacon
www.example.com/food/tomato
Make sure you clear your cache before testing this.
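For completeness, a small sketch that covers all three URLs from the question, assuming food.php and food2.php live in the document root next to the .htaccess file:

```apache
RewriteEngine On
# /food            -> food.php
RewriteRule ^food/?$ /food.php [L]
# /food/bacon      -> food2.php?type=bacon
# [^/]+ requires a non-empty type, so a bare /food/ falls through to the rule above.
RewriteRule ^food/([^/]+)/?$ /food2.php?type=$1 [L]
```

The filenames do not need to match the public URL structure; the rules only map one onto the other.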
The rewrite config should look like:
RewriteEngine On
RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
RewriteRule ^(food)/([a-z]+)$ /$12.php?type=$2 [L]
Please refer to mod_rewrite documentation for more details
Don't be confused by the "$12" backreference: mod_rewrite only offers one-digit references from $1 to $9 (and $0 for the whole matched string), so it is a reference to the first group followed by a literal "2".

rewrite rule issue in htaccess

I have the following rule which is working
RewriteRule ^(.+?)/(step)/([0-9]+)/(id)/([0-9]+)/(start)/([0-9]+)/(end)/([0-9]+)/?$ index.php?url=$1&$2=$3&$4=$5&$6=$7&$8=$9 [NC,L,QSA]
Now I want to add another parameter (ansid) at the end of the URL, so I changed the rule as follows, but for some reason it is not picking up the ansid.
RewriteRule ^(.+?)/(step)/([0-9]+)/(id)/([0-9]+)/(start)/([0-9]+)/(end)/([0-9]+)/(ansid)/([0-9]+)/?$ index.php?url=$1&$2=$3&$4=$5&$6=$7&$8=$9&$10=$11 [NC,L,QSA]
$10 and $11 won't work because, as per the Apache mod_rewrite manual:
RewriteRule backreferences:
These are backreferences of the form $N (0 <= N <= 9). $1 to $9 provide access to the grouped parts (in parentheses) of the pattern, from the RewriteRule which is subject to the current set of RewriteCond conditions. $0 provides access to the whole string matched by that pattern.
You need to refactor your rule so that it only uses backreferences up to $9.
Your rule could be rewritten as follows, with end and ansid written literally instead of being captured:
RewriteRule ^(.+?)/(step)/([0-9]+)/(id)/([0-9]+)/(start)/([0-9]+)/end/([0-9]+)/ansid/([0-9]+)/?$ index.php?url=$1&$2=$3&$4=$5&$6=$7&end=$8&ansid=$9 [NC,L,QSA]
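Going one step further, here is a sketch that writes every parameter name literally, so only the values are captured and the rule needs six backreferences. It assumes the parameter names are always step, id, start, end and ansid, as in the question:

```apache
# Only the URL prefix and the five numeric values are captured ($1..$6).
RewriteRule ^(.+?)/step/([0-9]+)/id/([0-9]+)/start/([0-9]+)/end/([0-9]+)/ansid/([0-9]+)/?$ index.php?url=$1&step=$2&id=$3&start=$4&end=$5&ansid=$6 [NC,L,QSA]
```

This leaves three backreferences free if more parameters are added later.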

mod_rewrite: match string within URL, which regex to chose?

I would like to use mod_rewrite to capture a string within brackets in my URL and do a redirect.
My URL:
something?var_a=A&var_b=(B)&var_c=C
My .htaccess file with the regex:
RewriteEngine on
RewriteRule ^/?.+var_b=\((.*)\)$ somedir/$1 [R]
I just would like to capture what's in between the round brackets, so my redirect should look something like this: somedir/B
I test my regex at http://htaccess.madewithlove.be/ but I get no match.
I don't know what I am missing here, even if I try much simpler regexes, e.g. .+var_b(.*)$ I get no match. Only if my regex was looking for a pattern at the beginning, I get a match, so for example the regex something(.*)$ works.
What am I missing here?
RewriteEngine On
RewriteCond %{QUERY_STRING} (^|&)var_b=\((.*?)\)(&|$) [NC]
RewriteRule ^.*$ somedir/%2? [R]
The reason is that the RewriteRule pattern is matched only against the URL path; it never sees the query string (the ?x=y part), so the query string has to be tested with a RewriteCond on %{QUERY_STRING}. %2 refers to the second group of the last matched RewriteCond, while $2 would refer to the second group of the RewriteRule pattern. The trailing ? in the substitution prevents the original query string from being appended to the result.
The (^|&) and (&|$) in the pattern guarantee that var_b=(B) is the complete parameter and not a part of it. Without these, the pattern would also match ?xyzvar_b=(B) or ?var_b=(B)xyz. With these, it will only match ?var_b=(B) or ?a=b&var_b=(B)&x=z etc.
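A variation on the same idea, as a sketch: the boundary groups are made non-capturing so the value is %1, and the QSD flag (available in Apache 2.4 and later) drops the original query string instead of the trailing ?. The explicit 302 status and the assumption that the parentheses arrive unencoded in the query string are mine, not the question's:

```apache
RewriteEngine On
# (?:^|&) and (?:&|$) delimit the parameter without using up capture numbers;
# [^)&]* captures the text between the round brackets as %1.
RewriteCond %{QUERY_STRING} (?:^|&)var_b=\(([^)&]*)\)(?:&|$) [NC]
RewriteRule ^ somedir/%1 [R=302,L,QSD]
```

For the URL in the question this would redirect to somedir/B.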

Cannot match literal dot with htaccess regex

The problem: I cannot figure out how to match a literal dot in my expression so I could rewrite query strings containing dots. First I tried something like this:
RewriteRule ^([\.\w]+)$ index.php?url=$1 [L]
I have a php script:
echo "url is: ".$_GET['url'];
which should, in theory, output anything that I write in my query. But for any query containing only letters and dots, my script always outputs:
url is: index.php
I've tried these expressions as well:
^(.+)$
^([.\w]+)$
And the result is the same.
So the question is: are my expressions wrong or does this have something to do with server's config?
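It looks like the rule is applied a second time to the already rewritten request. In .htaccess context the rewritten URL index.php is run through the rules again, and because index.php itself consists only of letters and a dot, it matches your pattern and url ends up as index.php. This fits what I observed: with a rule that cannot match index.php (e.g. .. for the request xy), the result is xy as expected. With looser patterns like .* or .+ it fails. x.* works fine, however.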
You can add another condition to ignore requests like index.php:
RewriteCond %{REQUEST_FILENAME} !index\.php$
RewriteRule ^(.+)$ index.php?url=$1 [L]
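This was tested and debugged with:
<?php
// Show the url parameter as received by the script (HTML-escaped).
printf("url is: %s <br>\n", htmlspecialchars((string) filter_input(INPUT_GET, 'url')));
// Dump the server variables to see which request actually reached the script.
echo "<pre>", htmlentities(print_r($_SERVER, true)), "</pre>";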
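Another common way to avoid the second pass is to skip the rewrite for anything that already exists on disk, which includes index.php itself. A sketch, assuming index.php sits in the same directory as the .htaccess file:

```apache
RewriteEngine On
# Do not rewrite requests for existing files (such as index.php) or directories.
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)$ index.php?url=$1 [L,QSA]
```

With these conditions a request like foo.bar is rewritten once, and the follow-up pass for index.php is left alone because the file exists.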