How do I find the 20th regex match group? - regex

I am writing a RewriteRule in the .htaccess file in one of my htdocs folders.
The rewriterule looks something like this:
RewriteRule ^index/(blah)/(blah2)/(blah3)..../(blah20)
^^^The above code looks like bad practice--don't worry about that.
Anyways, I heard before that ${20} was the correct way to access the 20th match group in regex, but even though in regex101 my 20th match group is matching blah20, whenever I print out the 20th capture group, I just get ${20}.
Why is this? Am I correctly accessing two digit match groups?
Edit--real rewriterule:
RewriteRule ^a/([\d]*)/(b/([\d]{2}:[\d]{2}:[\d]{2})/?)?(c/(\w*)/?)?(d/([\w]{6})/?)?(e/([\w]{6})/?)?(f/([\w]{6})/?)?(g/([\w]{6})/?)?(h/([\w]{6})/?)?(i/([\w]{6})/?)?(j/([\w]{6})/?)?(k/([\w]{6})/?)?(l/([\w]{6})/?)?(m/([\w]{6})/?)? /index.php?a=$1&b=$3&c=$5&d=$7&e=$9&f=${11}&g=${13}&h=${15}&i=${17}&j=${19}&k=${21}&l=${23}&m=${25} [L]

You cannot use a back-reference number greater than 9, as per the official mod_rewrite documentation.
From the manual:
RewriteRule back-references: These are back-references of the form $N (0 <= N <= 9). $1 to $9 provide access to the grouped parts (in parentheses) of the pattern, from the RewriteRule which is subject to the current set of RewriteCond conditions. $0 provides access to the whole string matched by that pattern.
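To make the single-digit limit concrete, here is a small Python sketch. It is a simplified model written for illustration, not Apache's actual implementation: the hypothetical `expand` helper substitutes only `$0` to `$9` and leaves every other character alone, which is enough to reproduce the behaviour described in the question.

```python
import re

def expand(substitution, groups):
    """Replace $0..$9 with captured groups; leave everything else as-is.

    Simplified model only (assumption): real mod_rewrite also expands
    %N, %{VAR} and ${map:key}, which are ignored here.
    """
    def repl(m):
        n = int(m.group(1))
        if n < len(groups) and groups[n] is not None:
            return groups[n]
        return ""
    # \$(\d) consumes exactly one digit after the dollar sign.
    return re.sub(r"\$(\d)", repl, substitution)

# Eleven path segments "aa/bb/.../kk", each captured by its own group.
segments = [chr(ord("a") + i) * 2 for i in range(11)]
pattern = re.compile("^" + "/".join([r"(\w+)"] * 11) + "$")
m = pattern.match("/".join(segments))
groups = [m.group(0)] + list(m.groups())

print(expand("x=$2", groups))     # second group
print(expand("x=$11", groups))    # read as $1 followed by a literal "1"
print(expand("x=${11}", groups))  # no $N reference found, text is unchanged
```

In this model `$11` never reaches the eleventh group, and `${11}` comes out verbatim, which matches what the asker observed.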
If you are dealing with this many back-references, it is better to pass the full URI after index/ to index.php and use explode inside the PHP code:
RewriteRule ^index/(.+)$ index.php?q=$1 [L,QSA,NC]
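The splitting step itself is simple. The answer suggests PHP's explode; the sketch below shows the same idea in Python purely for illustration. The function name `parse_segments` and the assumption that the path alternates key/value segments (as in the asker's a/.../b/.../c/... layout) are mine, not part of the original answer.

```python
def parse_segments(q):
    """Split 'key/value/key/value/...' into a dict.

    Assumes segments alternate between a key and its value; an unpaired
    trailing key is dropped.
    """
    parts = [p for p in q.split("/") if p]  # ignore empty segments
    return dict(zip(parts[0::2], parts[1::2]))

print(parse_segments("a/12/b/10:20:30/d/abc123/"))
```

With this approach the rewrite rule needs only one capture group, so the nine-group limit never comes into play.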

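Alternatively, use a non-capturing group for the folders you do not need, so the one you want gets a low group number. For example:
RewriteRule ^index(?:\/\w+){5}\/(blah6)
This matches the 6th folder in the URL and exposes it as $1.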

Related

Remove double dashes from url using mod_rewrite module in htaccess

I am using a RewriteRule to fix some URLs from an old webshop so that they remain available in the new webshop (relaunch). This is the regex.
#change period to dash
RewriteRule "^(.*)/([^.]*)\.+([^.]*\..*)$" $1/$2-$3 [L,NC]
RewriteRule "^(.*)/([^.]*)\.([^.]*)$" $1/$2-$3 [L,NC,R=302]
The idea is to convert periods in the url to a single dash.
/Biertischhussen/Dena-Biertischhusse-3tlg.-Set-Arcade-50x220cm-ecru-lang
/Biertischhussen/Dena-Biertischhusse-3tlg-Set-Arcade-50x220cm-ecru-lang
The rewriteRule only works 80% though, because it produces double dashes... How can I fix this?
/Biertischhussen/Dena-Biertischhusse-3tlg--Set-Arcade-50x220cm-ecru-lang/
If you add a consuming -* pattern after \.+, all existing - symbols will get matched, and thus will get removed as a result of the replacement (since they are not captured, i.e. they are not inside (...)).
Use
"^(.*)/([^.]*)\.+-*([^.]*\..*)$"
                 ^^
and
"^(.*)/([^.]*)\.-*([^.]*)$"
                ^^
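A quick way to see the effect of the extra -* is to run both versions of the second pattern over the sample URL. This is a Python sketch using the `re` module as a stand-in for Apache's PCRE, which behaves the same for these particular patterns as far as I know; the names `OLD` and `NEW` are just labels for the comparison.

```python
import re

# Second rule from the question, before and after adding the consuming -*.
OLD = re.compile(r"^(.*)/([^.]*)\.([^.]*)$")
NEW = re.compile(r"^(.*)/([^.]*)\.-*([^.]*)$")

url = "/Biertischhussen/Dena-Biertischhusse-3tlg.-Set-Arcade-50x220cm-ecru-lang"

old_result = OLD.sub(r"\1/\2-\3", url)  # the dash after the period survives in group 3
new_result = NEW.sub(r"\1/\2-\3", url)  # the dash is matched outside any group and dropped

print(old_result)
print(new_result)
```

The first result contains the double dash from the question; the second has a single dash.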

rewrite rule issue in htaccess

I have the following rule which is working
RewriteRule ^(.+?)/(step)/([0-9]+)/(id)/([0-9]+)/(start)/([0-9]+)/(end)/([0-9]+)/?$ index.php?url=$1&$2=$3&$4=$5&$6=$7&$8=$9 [NC,L,QSA]
Now I want to add another parameter, (ansid), at the end of the string. I tried the following, but for some reason it is not picking up ansid.
RewriteRule ^(.+?)/(step)/([0-9]+)/(id)/([0-9]+)/(start)/([0-9]+)/(end)/([0-9]+)/(ansid)/([0-9]+)/?$ index.php?url=$1&$2=$3&$4=$5&$6=$7&$8=$9&$10=$11 [NC,L,QSA]
$10 and $11 won't work because as per Apache mod_rewrite manual:
RewriteRule backreferences:
These are backreferences of the form $N (0 <= N <= 9). $1 to $9 provide access to the grouped parts (in parentheses) of the pattern, from the RewriteRule which is subject to the current set of RewriteCond conditions. $0 provides access to the whole string matched by that pattern.
You need to refactor your rule so that it uses back-references only up to $9. One way is to stop capturing the literal words end and ansid and write them directly into the substitution:
RewriteRule ^(.+?)/(step)/([0-9]+)/(id)/([0-9]+)/(start)/([0-9]+)/end/([0-9]+)/ansid/([0-9]+)/?$ index.php?url=$1&$2=$3&$4=$5&$6=$7&end=$8&ansid=$9 [NC,L,QSA]
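As a sanity check on the group count, the refactored pattern can be tried in Python. This is only a sketch: Python's `re` stands in for PCRE, `\N` replaces `$N` in the template, and the sample path is made up for illustration.

```python
import re

# Same pattern as the refactored rule: "end" and "ansid" are literals,
# so only nine capture groups remain. IGNORECASE mirrors the [NC] flag.
RULE = re.compile(
    r"^(.+?)/(step)/([0-9]+)/(id)/([0-9]+)/(start)/([0-9]+)"
    r"/end/([0-9]+)/ansid/([0-9]+)/?$",
    re.IGNORECASE,
)

path = "quiz/step/2/id/7/start/0/end/10/ansid/42"  # hypothetical request path
m = RULE.match(path)

print(RULE.groups)  # number of capture groups in the pattern
query = m.expand(r"url=\1&\2=\3&\4=\5&\6=\7&end=\8&ansid=\9")
print(query)
```

Every parameter, including ansid, is reachable without going past the ninth group.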

mod_rewrite: match string within URL, which regex to chose?

I would like to use mod_rewrite to capture a string within brackets in my URL and do a redirect.
My URL:
something?var_a=A&var_b=(B)&var_c=C
My .htaccess file with the regex:
RewriteEngine on
RewriteRule ^/?.+var_b=\((.*)\)$ somedir/$1 [R]
I just would like to capture what's in between the round brackets, so my redirect should look something like this: somedir/B
I test my regex at http://htaccess.madewithlove.be/ but I get no match.
I don't know what I am missing here, even if I try much simpler regexes, e.g. .+var_b(.*)$ I get no match. Only if my regex was looking for a pattern at the beginning, I get a match, so for example the regex something(.*)$ works.
What am I missing here?
RewriteEngine On
RewriteCond %{QUERY_STRING} (^|&)var_b=\((.*?)\)(&|$) [NC]
RewriteRule ^.*$ somedir/%2? [R]
The reason is that the RewriteRule pattern is matched only against the URL path; it never sees the query string (the ?x=y part). %2 refers to the second group of the last matched RewriteCond, while $2 would refer to the second group of the RewriteRule pattern itself. The ? at the end of the substitution prevents the original query string from being automatically appended to the result.
The (^|&) and (&|$) in the pattern guarantee that var_b=(B) is the complete parameter and not a part of it. Without these, the pattern would also match ?xyzvar_b=(B) or ?var_b=(B)xyz. With these, it will only match ?var_b=(B) or ?a=b&var_b=(B)&x=z etc.
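The anchoring behaviour described above can be checked with a short Python sketch. It applies the same condition pattern to a few query strings; `redirect_target` is a hypothetical helper that mimics what the `somedir/%2` substitution would produce, and Python's `re` is assumed to behave like PCRE for this pattern.

```python
import re

# Same pattern as the RewriteCond; IGNORECASE mirrors the [NC] flag.
COND = re.compile(r"(^|&)var_b=\((.*?)\)(&|$)", re.IGNORECASE)

def redirect_target(query_string):
    """Return 'somedir/<value>' if var_b=(...) is a complete parameter, else None."""
    m = COND.search(query_string)
    return f"somedir/{m.group(2)}" if m else None

for qs in ["var_a=A&var_b=(B)&var_c=C", "var_b=(B)", "xyzvar_b=(B)", "var_b=(B)xyz"]:
    print(qs, "->", redirect_target(qs))
```

The first two query strings yield somedir/B; the last two are rejected because var_b=(B) is only part of a longer parameter.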

mod_rewrite problems: negation

I'm trying to understand mod_rewrite better and have one particular problem I think I need to get my head round first.
I am rewriting http://www.somesite.tld/a/b/c to index.php?path=a/b/c using the following
RewriteRule ^(?!index.php)(.*)$ index.php?path=$1 [NC,L]
An equivalent rewrite would, in this case, be
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ index.php?path=$1 [NC,L]
This does not work without the RewriteCond -- path=index.php would be the result without specifically ignoring files or saying 'not index.php'. Why is this?
Also, what is the ?! and ?: syntax that I sometimes see used? I do not understand the use of the ? when it is not prefixed by anything.
And why, in the first RewriteRule above, do the second pair of brackets return a match for $1?
Cheers
(?= ...) and (?! ...) are special syntax in Perl regular expressions and in PCRE, which is the regex library that Apache uses. They are, respectively, positive and negative lookahead assertions: they match an empty string if the text after them matches, or does not match, the content in the brackets.
They are non-capturing, so they don't define any $n (it would be pointless, since they match an empty string). That is why the second pair of brackets in your rule is $1. (?: ...) is also non-capturing; it is used to group subexpressions.
Your first rule should work in .htaccess (but not in a virtual host configuration file, where the matched path starts with a leading slash), though it would be more correct to write it as
RewriteRule ^(?!index\.php$)(.*)$ index.php?path=$1 [L]
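Perhaps another rule is interacting with it. You can check what exactly is being matched and rewritten with RewriteLog and RewriteLogLevel (Apache 2.2; Apache 2.4 replaced them with the LogLevel directive, e.g. LogLevel alert rewrite:trace3).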
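The difference between the original lookahead and the escaped, anchored one shows up on paths that merely resemble index.php. A Python sketch follows, with `re` assumed to match PCRE's lookahead semantics here; the sample paths and the names `loose` and `strict` are made up for illustration.

```python
import re

loose = re.compile(r"^(?!index.php)(.*)$")     # "." matches any character, no end anchor
strict = re.compile(r"^(?!index\.php$)(.*)$")  # rejects only the exact string "index.php"

paths = ["a/b/c", "index.php", "index.php5/info", "index-php-notes"]
results = {p: (bool(loose.match(p)), bool(strict.match(p))) for p in paths}

for p, (l, s) in results.items():
    print(f"{p!r}: loose={l} strict={s}")

# The lookahead captures nothing, so the first capturing group is (.*).
print(strict.match("a/b/c").group(1))
```

The loose form also refuses to rewrite index.php5/info and index-php-notes, while the strict form skips only index.php itself.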
"!" means negation, as in a = 1 (a equals one) versus a != 1 (a does not equal one).
"-f" tests whether the path is an existing regular file, so "!-f" means "is not an existing file". The links below may help:
http://www.askapache.com/htaccess/htaccess.html
http://net.tutsplus.com/tutorials/other/using-htaccess-files-for-pretty-urls/
http://corz.org/serv/tricks/htaccess2.php

mod_rewrite replace '_' with '-'

I'm almost there with a mod_rewrite rule, but I've caved in :)
I need to rewrite
country/[countryname].php
to
country/[countryname]/
However, [countryname] may have an underscore like this: 'south_africa.php', and if it does I want to replace it with a hyphen: 'south-africa/'
I also want to match if the country has numbers following it: 'france03.php' to 'france/'
Here's my rule. It's almost there, but it still adds a hyphen even if there is no second part after the underscore.
RewriteRule ^country/(.*)_(.*?)[0-9]*\.php$ country/$1-$2 [R=301,L]
so currently 'country/south_.php' becomes 'country/south-/'
Can someone please help me find the missing piece of the puzzle? Thanks.
Try this:
RewriteRule ^country/([^_]*)_([^_]*?)\d*\.php$ country/$1-$2 [R=301,L]
This rule will match urls with a single underscore - you'll need a different rule for more underscores or none.
If you want to make sure $2 contains only letters and isn't empty, change ([^_]*?) to ([a-zA-Z]+).
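The two variants of the pattern can be compared on the cases from the question. This is a Python sketch with `re` assumed to behave like PCRE for these patterns; the `rewrite` helper is hypothetical and returns None where the RewriteRule would simply not fire.

```python
import re

ORIGINAL = re.compile(r"^country/([^_]*)_([^_]*?)\d*\.php$")
LETTERS = re.compile(r"^country/([^_]*)_([a-zA-Z]+)\d*\.php$")  # $2 must be non-empty letters

def rewrite(pattern, path):
    """Return the substituted path, or None if the pattern does not match."""
    m = pattern.match(path)
    return m.expand(r"country/\1-\2") if m else None

for path in ["country/south_africa.php", "country/south_africa03.php",
             "country/south_.php", "country/france03.php"]:
    print(path, "->", rewrite(ORIGINAL, path), "|", rewrite(LETTERS, path))
```

Both variants handle south_africa.php and drop trailing digits; only the first still produces country/south- for south_.php, and neither matches france03.php because it has no underscore.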
Alternatively you could do it over several passes:
# If request is for something in "country/"
RewriteCond %{REQUEST_URI} ^country/.+\.php$
# Replace underscore and digits with (single) hyphen
RewriteRule [_0-9]+ \-
# Remove extension (and possible trailing hyphen)
RewriteRule ^(.*)-?\.php$ $1
# Final rewrite
RewriteRule ^country/(.*)$ country/$1 [R=301,L]
Untested ... and not necessarily "pretty" :)