.htaccess regex: match a URL that starts with one of two prefixes

I need to declare a rewrite rule that applies if the URL starts with 'katalogas/imone' or 'imone'. I tried this:
RewriteRule ^(katalogas/imone|imone)/(.*) http://google.lt
Something seems wrong in the group (katalogas/imone|imone): I suspect the '/' causes the second alternative to be ignored. How do I escape the '/'? The URL must start with 'katalogas/imone' or 'imone'.

I don't see a problem with your regular expression; '/' has no special meaning in a RewriteRule pattern and does not need escaping. The only optimization could be to refactor the two prefixes into one with an optional part
RewriteRule ^((?:katalogas/)?imone)/(.*) http://google.lt
and since you don't use the captured parts, you can further simplify it to
RewriteRule ^(?:katalogas/)?imone/ http://google.lt
Another approach could be to just use two separate rules
RewriteRule ^katalogas/imone/ http://google.lt
RewriteRule ^imone/ http://google.lt
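A quick way to check that the alternation and the optional-prefix form accept the same paths is to run both through a regex engine. The sketch below uses Python's re module as a stand-in for Apache's PCRE (an assumption; the syntax used here behaves the same in both), and the sample paths are made up for illustration.

```python
import re

# Patterns taken from the rules above. In .htaccess the matched path
# has no leading slash, so the samples below have none either.
ALTERNATION = re.compile(r"^(katalogas/imone|imone)/(.*)")
OPTIONAL_PREFIX = re.compile(r"^(?:katalogas/)?imone/")

def matches(pattern, path):
    """Return True if the rule's pattern would match the request path."""
    return pattern.search(path) is not None

# Hypothetical request paths, for illustration only.
samples = ["imone/123", "katalogas/imone/123", "katalogas/123", "kita/imone/123"]
results = {p: (matches(ALTERNATION, p), matches(OPTIONAL_PREFIX, p)) for p in samples}

for path, (alt, opt) in results.items():
    print(f"{path!r}: alternation={alt}, optional-prefix={opt}")
```

Both patterns agree on every sample, and the ^ anchor keeps 'kita/imone/123' from matching even though it contains 'imone/'.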

File .htaccess not working with URL that has a dash and 2nd RewriteRule not applied

I have a problem with my .htaccess file. As I understand it, RewriteRule rewrites the URL, but the following two cases don't work.
#1 The first RewriteRule works but the second doesn't work
RewriteRule ^([a-zA-Z0-9_-]+)$ index.php?idcat=$1 [L] #working
RewriteRule ^([a-zA-Z0-9_-]+)$ index.php?idl=$1 [L] #not working
#2 The RewriteRule doesn't work with dash but works with slash and underscore.
RewriteRule ^([a-zA-Z0-9_-]+)-([a-zA-Z0-9_-]+)$ index.php?idl=$1&iddis=$2 [L] #not working
RewriteRule ^([a-zA-Z0-9_-]+)_([a-zA-Z0-9_-]+)$ index.php?idl=$1&iddis=$2 [L] #working
RewriteRule ^([a-zA-Z0-9_-]+)/([a-zA-Z0-9_-]+)$ index.php?idl=$1&iddis=$2 [L] #working
So how to fix these problems? Does anyone have any suggestions for me?
#1 The first RewriteRule works but the second doesn't work
RewriteRule ^([a-zA-Z0-9_-]+)$ index.php?idcat=$1 [L] #working
RewriteRule ^([a-zA-Z0-9_-]+)$ index.php?idl=$1 [L] #not working
Because you are using the same pattern in both rules, the first rule always "wins" and the second rule is never triggered. This is essentially processed as follows (pseudo-code):
if (the URL matches the pattern "^([a-zA-Z0-9_-]+)$") {
    rewrite the request to "index.php?idcat=<url>"
}
elseif (the URL matches the pattern "^([a-zA-Z0-9_-]+)$") {
    rewrite the request to "index.php?idl=<url>"
}
As you can see, the second code block is never processed since the expressions are the same.
To put it another way, how would you determine whether a request of the form /foo should be rewritten to index.php?idcat=foo or to index.php?idl=foo? You can't rewrite the request to both.
In this particular case you could perhaps rewrite everything to index.php?id=<url> and let your script decide whether it should be idcat or idl. Otherwise, there needs to be something different about the two URLs (and consequently the patterns you are using to match the URLs) that allows you to determine how the URL should be rewritten.
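The "first rule wins" behaviour can be reproduced outside Apache. This is only a simplified model, assuming a single pass over the rules where [L] stops at the first match; the rewrite function and the RULES list are hypothetical names, not part of mod_rewrite.

```python
import re

# The two rules from the question, in file order.
# Targets use \1 where the .htaccess substitution uses $1.
RULES = [
    (re.compile(r"^([a-zA-Z0-9_-]+)$"), r"index.php?idcat=\1"),
    (re.compile(r"^([a-zA-Z0-9_-]+)$"), r"index.php?idl=\1"),
]

def rewrite(path):
    """Apply the first matching rule and stop, as [L] does within one pass."""
    for pattern, target in RULES:
        match = pattern.match(path)
        if match:
            return match.expand(target)
    return path  # no rule matched, path is left untouched

print(rewrite("foo"))
print(rewrite("foo/bar"))
```

Because both patterns are identical, any path that reaches the second rule has already failed the first, so the idl target can never be produced.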
#2 The RewriteRule doesn't work with dash but works with slash and underscore.
RewriteRule ^([a-zA-Z0-9_-]+)-([a-zA-Z0-9_-]+)$ index.php?idl=$1&iddis=$2 [L] #not working
RewriteRule ^([a-zA-Z0-9_-]+)_([a-zA-Z0-9_-]+)$ index.php?idl=$1&iddis=$2 [L] #working
Both of these rules have the same problem, depending on the URLs being requested, because the patterns you are using are ambiguous. Each of the two subpatterns (either side of the delimiter) that match the idl and iddis values also accepts the character used as the delimiter, - or _. In the 3rd rule (not shown), however, you are using a / as the delimiter, which does not occur in the surrounding subpatterns, so there is no ambiguity.
For example, how should (or how would you expect) a URL of the form /foo-bar-baz to be matched by the first rule? Since the first subpattern uses the greedy quantifier +, it will capture foo-bar and baz and rewrite the request to index.php?idl=foo-bar&iddis=baz.
To avoid this ambiguity you need to make sure the delimiter between the subpatterns (i.e. between the values for idl and iddis) is different to the characters allowed in the subpatterns (or at least in one of the two subpatterns).
This can often be resolved by making the regex as specific as possible, i.e. by matching only the characters that are actually valid in idl and iddis.
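The greedy split described above is easy to observe directly. The sketch below uses Python's re module in place of Apache's PCRE; the SPECIFIC pattern assumes, purely for illustration, that idl never contains a hyphen, which may not hold for your URLs.

```python
import re

# Pattern from the question: both sides also accept the "-" delimiter.
AMBIGUOUS = re.compile(r"^([a-zA-Z0-9_-]+)-([a-zA-Z0-9_-]+)$")
# Assumption for illustration: idl never contains a hyphen.
SPECIFIC = re.compile(r"^([a-zA-Z0-9_]+)-([a-zA-Z0-9_-]+)$")

def split(pattern, path):
    """Return the (idl, iddis) pair the pattern captures, or None."""
    match = pattern.match(path)
    return match.groups() if match else None

print(split(AMBIGUOUS, "foo-bar-baz"))  # greedy: splits at the last hyphen
print(split(SPECIFIC, "foo-bar-baz"))   # splits at the first hyphen
```

With the ambiguous pattern the split lands on the last hyphen; once the first subpattern excludes the delimiter, the split point is fixed at the first hyphen.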
To begin resolving this issue, you need to first identify the precise URLs you are trying to match, before implementing the rules to match them.

htaccess - how to apply RegEx patterns on the output of another - encapsulation

I'm performing several regular expressions on a string inside a variable in order to clean it up for further use in the htaccess rules, but it seems rather cumbersome to do such a simple thing in several lines:
RewriteCond %{THE_REQUEST} (?<=\s)(.*?)(?=\s)
RewriteRule ^(.*)$ - [E=HREFPATH:%1]
RewriteCond %{ENV:HREFPATH} (^.*)?\?
RewriteRule ^(.*)$ - [E=HREFPATH:%1]
RewriteCond %{ENV:HREFPATH} /(.*)
RewriteRule ^(.*)$ - [E=HREFPATH:%1]
RewriteCond %{ENV:HREFPATH} (.*)/$
RewriteRule ^(.*)$ - [E=HREFPATH:%1]
How can I reduce this to 2 lines?
Basically I'm looking for a way to encapsulate each as aggregation steps (filter) based on the output of the previous expression, but my humble efforts have failed after trying and web-searching for hours.
The code above does what I need it to do, it's just really ugly (not elegant).
In PHP, or basically any decent(ish) language it could be as simple as:
$HREFPATH = trim(explode("?", explode(" ", $THE_REQUEST)[1])[0], "/");
But this is NOT a PHP-related question; the snippet is merely a simple way to explain what I mean and what I'm trying to achieve.
I know there may be many RegEx patterns that could (theoretically) work here, but it should be compatible with Apache's RegEx engine.
Any input will be rewarded in kind; thanks in advance.
What you are doing in multiple rules can be done in a single rule like this:
RewriteCond %{THE_REQUEST} \s/+([^?]*?)/*[\s?]
RewriteRule ^ - [E=HREFPATH:%1]
RegEx Details:
\s: Match a whitespace
/+: Match 1+ /s
([^?]*?): Lazily match 0 or more of any characters that are not ?. Capture this value in %1
/*: Match 0 or more trailing /s
[\s?]: Must be followed by a ? or a whitespace
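The breakdown above can be checked against a few request lines. This sketch runs the same pattern through Python's re module (assumed here to behave like Apache's PCRE for this syntax); href_path is a hypothetical helper and the request lines are invented samples.

```python
import re

# Same pattern as in the RewriteCond above.
PATTERN = re.compile(r"\s/+([^?]*?)/*[\s?]")

def href_path(request_line):
    """Return what %1 would capture from a THE_REQUEST-style line, or None."""
    match = PATTERN.search(request_line)
    return match.group(1) if match else None

for line in [
    "GET /foo/bar/?x=1 HTTP/1.1",
    "GET /foo/bar HTTP/1.1",
    "GET //a/b// HTTP/1.1",
    "GET / HTTP/1.1",
]:
    print(repr(line), "->", repr(href_path(line)))
```

Leading and trailing slashes and the query string are all dropped in one pass, and a request for the site root yields an empty string.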

mod_rewrite: match string within URL, which regex to choose?

I would like to use mod_rewrite to capture a string within brackets in my URL and do a redirect.
My URL:
something?var_a=A&var_b=(B)&var_c=C
My .htaccess file with the regex:
RewriteEngine on
RewriteRule ^/?.+var_b=\((.*)\)$ somedir/$1 [R]
I just would like to capture what's in between the round brackets, so my redirect should look something like this: somedir/B
I test my regex at http://htaccess.madewithlove.be/ but I get no match.
I don't know what I am missing here, even if I try much simpler regexes, e.g. .+var_b(.*)$ I get no match. Only if my regex was looking for a pattern at the beginning, I get a match, so for example the regex something(.*)$ works.
What am I missing here?
RewriteEngine On
RewriteCond %{QUERY_STRING} (^|&)var_b=\((.*?)\)(&|$) [NC]
RewriteRule ^.*$ somedir/%2? [R]
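The reason is that the RewriteRule pattern is matched against the URL path only; it never sees the query string (the ?x=y part), so that has to be tested with a RewriteCond on %{QUERY_STRING}. %2 refers to the second captured group of the last matched RewriteCond, while $2 would refer to the second captured group of the RewriteRule pattern itself. The ? at the end of the substitution prevents the original query string from being automatically appended to the result.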
The (^|&) and (&|$) in the pattern guarantee that var_b=(B) is the complete parameter and not a part of it. Without these, the pattern would also match ?xyzvar_b=(B) or ?var_b=(B)xyz. With these, it will only match ?var_b=(B) or ?a=b&var_b=(B)&x=z etc.
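The effect of the two boundary groups can be checked against the example query strings. The sketch below applies the same pattern with Python's re module (re.IGNORECASE standing in for the [NC] flag); redirect_target is a hypothetical helper that mimics the somedir/%2 substitution.

```python
import re

# Same pattern as in the RewriteCond; IGNORECASE plays the role of [NC].
QUERY_PATTERN = re.compile(r"(^|&)var_b=\((.*?)\)(&|$)", re.IGNORECASE)

def redirect_target(query_string):
    """Return the path built from %2, or None if the condition fails."""
    match = QUERY_PATTERN.search(query_string)
    return f"somedir/{match.group(2)}" if match else None

for qs in [
    "var_a=A&var_b=(B)&var_c=C",
    "var_b=(B)",
    "xyzvar_b=(B)",
    "var_b=(B)xyz",
]:
    print(repr(qs), "->", redirect_target(qs))
```

Only the query strings in which var_b=(...) is a complete parameter produce a target; the two malformed variants fail the condition.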

mod_rewrite problems: negation

I'm trying to understand mod_rewrite better and have one particular problem I think I need to get my head round first.
I am rewriting http://www.somesite.tld/a/b/c to index.php?path=a/b/c using the following
RewriteRule ^(?!index.php)(.*)$ index.php?path=$1 [NC,L]
An equivalent rewrite would, in this case, be
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ index.php?path=$1 [NC,L]
This does not work without the RewriteCond -- path=index.php would be the result without specifically ignoring files or saying 'not index.php'. Why is this?
Also, what is the ?! and ?: syntax that I sometimes see used? I do not understand the use of the ? when it is not prefixed by anything.
And why, in the first RewriteRule above, do the second pair of brackets return a match for $1?
Cheers
(?= ...) and (?! ...) are special syntax in Perl regular expressions and in PCRE, the regex library that Apache uses. They are, respectively, positive and negative lookahead assertions: they match an empty string if the text that follows does (or, for the negative form, does not) match the content of the brackets.
They are non-capturing, so they don't define any $n (it would be pointless, since they match an empty string). (?: ...) is also non-capturing, it is used to group subexpressions.
Your first rule should work in .htaccess (but not in a virtual host configuration file, where the matched path begins with a slash), though it would be more correct to write it as
RewriteRule ^(?!index\.php$)(.*)$ index.php?path=$1 [L]
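To see what the escaped dot and the $ inside the lookahead change, both patterns can be tried on a few paths. This sketch uses Python's re module, whose lookahead syntax matches PCRE's for these patterns; the sample paths and the captured helper are hypothetical.

```python
import re

# Lookahead from the question: unescaped dot, no end anchor.
ORIGINAL = re.compile(r"^(?!index.php)(.*)$")
# Stricter form: literal dot, and only the exact path index.php is excluded.
STRICTER = re.compile(r"^(?!index\.php$)(.*)$")

def captured(pattern, path):
    """Return what $1 would hold, or None if the rule does not match."""
    match = pattern.match(path)
    return match.group(1) if match else None

for path in ["a/b/c", "index.php", "index-php/page"]:
    print(repr(path), captured(ORIGINAL, path), captured(STRICTER, path))
```

The lookahead itself consumes nothing and defines no group, which is why the only capture, $1, comes from the (.*) that follows it. The original form also rejects unrelated paths that merely begin with something resembling index.php.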
Perhaps another rule is interacting with it. You can check what exactly is being matched and rewritten with RewriteLog and RewriteLogLevel (Apache 2.2; in Apache 2.4 these were replaced by the rewrite:traceN levels of the LogLevel directive).
"!" means negation, just as a = 1 means "a equals one" and a != 1 means "a does not equal one".
"-f" tests whether the path is an existing regular file, so combined with "!", as in "!-f", the condition means "the requested file does not exist". The links below may help you further:
http://www.askapache.com/htaccess/htaccess.html
http://net.tutsplus.com/tutorials/other/using-htaccess-files-for-pretty-urls/
http://corz.org/serv/tricks/htaccess2.php

mod_rewrite replace '_' with '-'

I'm almost there with a mod_rewrite rule, but I've caved in :)
I need to rewrite
country/[countryname].php
to
country/[countryname]/
however, [countryname] may have an underscore like this: 'south_africa.php' and if it does I want to replace it with a hyphen: 'south-africa/'
I also want to match if the country has numbers following it: 'france03.php' to 'france/'
Here's my rule. It's almost there, but it still adds a hyphen even if there is no second part after the underscore.
RewriteRule ^country/(.*)_(.*?)[0-9]*\.php$ country/$1-$2 [R=301,L]
so currently 'country/south_.php' becomes 'country/south-/'
Can someone please help me find the missing piece of the puzzle? Thanks.
Try this:
RewriteRule ^country/([^_]*)_([^_]*?)\d*\.php$ country/$1-$2 [R=301,L]
This rule will match urls with a single underscore - you'll need a different rule for more underscores or none.
If you want to make sure $2 contains only letters and isn't empty, change ([^_]*?) to ([a-zA-Z]+).
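The difference between the two variants shows up on the 'south_.php' case from the question. The sketch below runs both patterns through Python's re module as a stand-in for Apache's PCRE; target is a hypothetical helper that builds the same country/$1-$2 substitution.

```python
import re

# Pattern from the rule above: the second part may be empty.
LOOSE = re.compile(r"^country/([^_]*)_([^_]*?)\d*\.php$")
# Variant with ([a-zA-Z]+): the second part must contain at least one letter.
STRICT = re.compile(r"^country/([^_]*)_([a-zA-Z]+)\d*\.php$")

def target(pattern, path):
    """Return the redirect target country/$1-$2, or None if there is no match."""
    match = pattern.match(path)
    return f"country/{match.group(1)}-{match.group(2)}" if match else None

for path in [
    "country/south_africa.php",
    "country/south_africa03.php",
    "country/south_.php",
    "country/france03.php",
]:
    print(repr(path), target(LOOSE, path), target(STRICT, path))
```

The loose pattern still produces a dangling hyphen for 'south_.php', the strict one skips that path entirely, and neither matches a name without an underscore, which is why a separate rule is needed for that case.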
Alternatively, you could do it over several passes. A RewriteRule substitution replaces the whole path rather than just the matched part, so each pass has to capture whatever it wants to keep:
# Replace one run of underscores with a single hyphen; [N] restarts the rules until none remain
RewriteRule ^(country/[^_]*)_+(.*\.php)$ $1-$2 [N]
# Strip a possible trailing hyphen, trailing digits and the extension, then redirect
RewriteRule ^country/(.+?)-?[0-9]*\.php$ country/$1/ [R=301,L]
Untested, and not necessarily "pretty" :)