I have a rewrite rule that redirects /myPage.php?myQueryVar=foo-aRandomString to /myNewPage/foo-aRandomString. I only want this to apply when the query value contains a hyphen, so I have some conditions in place, as seen below:
RewriteCond %{QUERY_STRING} (^|&)myQueryVar=foo-(.*)($|&)
RewriteCond %{THE_REQUEST} ^GET\ /myPage\.php\?myQueryVar=([^\s&]+) [NC]
RewriteRule ^myPage\.php$ /myNewPage/%1? [R=301,L]
I'd like to add another rule exception allowing /myPage.php?myQueryVar=bar-aRandomString. Currently I've simply cloned the above code and used it again, changing foo to bar, as seen below. Is there a cleaner way of doing this without having multiple lines of near-identical code? Thank you.
RewriteCond %{QUERY_STRING} (^|&)myQueryVar=bar-(.*)($|&)
RewriteCond %{THE_REQUEST} ^GET\ /myPage\.php\?myQueryVar=([^\s&]+) [NC]
RewriteRule ^myPage\.php$ /myNewPage/%1? [R=301,L]
Try to use this:
RewriteCond %{QUERY_STRING} (^|&)myQueryVar=(foo|bar)-(.*)($|&)
RewriteCond %{THE_REQUEST} ^GET\ /myPage\.php\?myQueryVar=([^\s&]+) [NC]
RewriteRule ^myPage\.php$ /myNewPage/%1? [R=301,L]
(The first RewriteCond was changed)
Basically, foo was changed to this (foo|bar) which means:
( # begin of group
foo # a literal 'foo'
| # or
bar # a literal 'bar'
) # end of group
Bear in mind we had to enclose the two options within a group, since if not, the regex would have meant instead:
(^|&)myQueryVar=foo|bar-(.*)($|&) ==> (^|&)myQueryVar=foo OR bar-(.*)($|&)
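The effect of the missing group can be checked outside Apache. The following is only a sketch in Python, assuming its re module behaves like Apache's PCRE for these simple patterns; the sample query strings and the helper name matches are made up for illustration.

```python
import re

# Ungrouped: the alternation splits the WHOLE pattern into two halves.
ungrouped = re.compile(r"(^|&)myQueryVar=foo|bar-(.*)($|&)")
# Grouped: the alternation only chooses between 'foo' and 'bar'.
grouped = re.compile(r"(^|&)myQueryVar=(foo|bar)-(.*)($|&)")

def matches(pattern, text):
    """Return True if the pattern is found anywhere in text."""
    return pattern.search(text) is not None

samples = [
    "myQueryVar=foo-abc",  # wanted
    "myQueryVar=bar-abc",  # wanted
    "myQueryVar=foo",      # no hyphen: not wanted
    "other=bar-abc",       # wrong parameter: not wanted
]

for s in samples:
    print(f"{s!r}: ungrouped={matches(ungrouped, s)} grouped={matches(grouped, s)}")
```

The ungrouped pattern accepts all four samples, while the grouped one accepts only the first two.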
Also, if you don't want to capture what is inside the parentheses (that's why they are called 'capturing groups'), you may use 'non-capturing groups' instead: (?:). Using non-capturing groups is good practice if you don't actually need to capture the inner data.
Also, you don't need the group holding the .* on the first RewriteCond, since you really use the capturing group of the second RewriteCond
So the rules could be changed like this:
RewriteCond %{QUERY_STRING} (?:^|&)myQueryVar=(?:foo|bar)-.*(?:$|&)
RewriteCond %{THE_REQUEST} ^GET\ /myPage\.php\?myQueryVar=([^\s&]+) [NC]
RewriteRule ^myPage\.php$ /myNewPage/%1? [R=301,L]
But even so, you could use just one single RewriteCond by joining the two regexes:
RewriteCond %{THE_REQUEST} ^GET\ /myPage\.php\?myQueryVar=((?:foo|bar)-[^\s&]+) [NC]
RewriteRule ^myPage\.php$ /myNewPage/%1? [R=301,L]
(Note that (?:foo|bar) is a non-capturing group: you don't need to capture just that part, you need to capture the variable as a whole.)
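To see what %1 would hold with the single combined condition, the pattern can be tried against a few request lines. This is a hedged sketch in Python rather than Apache itself: re.IGNORECASE stands in for [NC], the request lines are invented, and captured is a hypothetical helper.

```python
import re

# Same pattern as the single RewriteCond; re.IGNORECASE stands in for [NC].
THE_REQUEST = re.compile(
    r"^GET\ /myPage\.php\?myQueryVar=((?:foo|bar)-[^\s&]+)", re.IGNORECASE
)

def captured(request_line):
    """Return the text of the first capturing group, or None if there is no match."""
    m = THE_REQUEST.search(request_line)
    return m.group(1) if m else None

print(captured("GET /myPage.php?myQueryVar=foo-aRandomString HTTP/1.1"))
print(captured("GET /myPage.php?myQueryVar=bar-aRandomString HTTP/1.1"))
print(captured("GET /myPage.php?myQueryVar=baz-aRandomString HTTP/1.1"))
```

The first two calls return the whole value (prefix included), and the third returns None because baz is not one of the allowed prefixes.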
Related
I would like to remove all query strings including parameters and values from URLs with htaccess rules.
Here are a few URLs with query strings as examples which are needed to be removed from the end of URLs.
https://example.com/other-category-slug/page/15/?orderby=price-desc&add_to_wishlist=342
https://example.com/page/62/?option=com_content&view=article&id=91&Itemid=2
https://example.com/page/30/?start=72
https://example.com/other-category-slug/page/12/?add_to_wishlist=9486
https://example.com/other-category-slug/page/15/?add_to_wishlist=9486
https://example.com/other-category-slug/page/4/?orderby=price-desc&add_to_wishlist=332
https://example.com/other-category-slug/page/15/?orderby=price-desc&add_to_wishlist=5736
https://example.com/other-category-slug/page/7/?orderby=popularity
https://example.com/other-category-slug/page/15/?add_to_wishlist=350
https://example.com/category-slug/page/19/?orderby=price-desc
https://example.com/category-slug/page/3/?orderby=date
https://example.com/page/2/?post_type=map
https://example.com/category-slug/page/2/?PageSpeed=noscript
https://example.com/category/page/6/?orderby=menu_order
https://example.com/page/50/?Itemid=wzshaxrogq
https://example.com/category-slug/page/1/?orderby=price&add_to_wishlist=12953
https://example.com/category-slug/this-is-product-slug/?PageSpeed=noscript
https://example.com/category-slug/?add_to_wishlist=15153
https://example.com/page/24/?op
https://example.com/page/68/?iact=hc&vpx=262&vpy=212&dur=2871&hovh=259&hovw=194&tx=104&ty=131&ei=KJ05TtKZOoi8rAfM2ZmPBQ&page=1&tbnh=129&tbnw=97&start=0&ndsp=35&ved=1t%3A429%2Cr%3A9%2Cs%3A0&doing_wp_cron=1466467271.7778379917144775390625
I need clean URLs like these, without query strings and parameters. category-slug and product-slug are just examples. I believe I need 5 rules.
https://example.com/category-slug/product-slug/
https://example.com/category-slug/page/15/
https://example.com/category-slug/
https://example.com/page/62/
https://example.com/
Here are a few query strings which I want to keep.
https://example.com/?attachment_id=123
https://example.com/?p=123
https://example.com/page/12/?fbclid=PAAaaK8eCN
https://example.com/your-shopping-cart/?remove_item=22c1acb3539e1aeba2
https://example.com/category-slug/this-is-product-slug/?add-to-cart=29030
https://example.com/?s=%7Bsearch_term_string%7D
Here is my code which is not working. In fact I don't understand the Regex in them.
RewriteEngine On
RewriteRule ^(page/[0-9]+)/.+$ /$1? [L,NC,R=301]
RewriteCond %{QUERY_STRING} ^option=.+$ [NC,OR]
RewriteCond %{QUERY_STRING} ^[^=]+$
RewriteRule ^$ /? [L,NC,R=301]
Thanks in advance
Yes, the query strings are exact matches.
Whilst you've given examples of the URL-path, it looks like you just need to base the match on the query string part of the URL, not the URL-path? Unless the same query string could appear on another URL-path that you would want to keep?
You would only need to focus on the query strings you want to remove, not the ones you want to keep.
I believe I need 5 rules.
It looks like you would need just one rule, but with a lot of conditions (RewriteCond directives). One condition for every query string (since you say they are "exact matches").
RewriteCond %{QUERY_STRING} ^option=.+$ [NC,OR]
RewriteCond %{QUERY_STRING} ^[^=]+$
Although, rather confusingly, you are not attempting an "exact match" at all in your rule, but rather using a generic pattern. (Although you've stated you "don't understand the Regex".)
If you are wanting "exact matches" then you don't need to use regex at all. You can use the = prefix operator on the CondPattern (2nd argument to the RewriteCond directive) to make it an exact (lexicographical) match.
For example, try something like the following instead:
RewriteEngine On
RewriteCond %{QUERY_STRING} =orderby=price-desc&add_to_wishlist=342 [OR]
RewriteCond %{QUERY_STRING} =option=com_content&view=article&id=91&Itemid=2 [OR]
RewriteCond %{QUERY_STRING} =start=72 [OR]
RewriteCond %{QUERY_STRING} =add_to_wishlist=9486 [OR]
RewriteCond %{QUERY_STRING} =orderby=price-desc&add_to_wishlist=332 [OR]
RewriteCond %{QUERY_STRING} =orderby=price-desc&add_to_wishlist=5736 [OR]
RewriteCond %{QUERY_STRING} =orderby=popularity [OR]
RewriteCond %{QUERY_STRING} =add_to_wishlist=350 [OR]
RewriteCond %{QUERY_STRING} =orderby=price-desc [OR]
RewriteCond %{QUERY_STRING} =orderby=date [OR]
RewriteCond %{QUERY_STRING} =post_type=map [OR]
RewriteCond %{QUERY_STRING} =PageSpeed=noscript [OR]
RewriteCond %{QUERY_STRING} =orderby=menu_order [OR]
RewriteCond %{QUERY_STRING} =Itemid=wzshaxrogq [OR]
RewriteCond %{QUERY_STRING} =orderby=price&add_to_wishlist=12953 [OR]
RewriteCond %{QUERY_STRING} =PageSpeed=noscript [OR]
RewriteCond %{QUERY_STRING} =add_to_wishlist=15153 [OR]
RewriteCond %{QUERY_STRING} =op [OR]
RewriteCond %{QUERY_STRING} =iact=hc&vpx=262&vpy=212&dur=2871&hovh=259&hovw=194&tx=104&ty=131&ei=KJ05TtKZOoi8rAfM2ZmPBQ&page=1&tbnh=129&tbnw=97&start=0&ndsp=35&ved=1t%3A429%2Cr%3A9%2Cs%3A0&doing_wp_cron=1466467271.7778379917144775390625
RewriteRule ^ %{REQUEST_URI} [QSD,R=302,L]
The above redirects to the same URL-path, but strips the original query string if it matches any of those stated in the preceding conditions.
The QSD flag (Query String Discard) strips the original query string from the request. This is the preferred method on Apache 2.4. However, if you are still on Apache 2.2 then you would need to append an empty query string instead (as you are doing in your existing rule). For example:
RewriteRule ^ %{REQUEST_URI}? [R,L]
Note there is no OR flag on the last RewriteCond directive.
NB: You had included the query string add_to_wishlist=9486 twice in the list of URLs/query strings to remove.
Test first with a 302 (temporary) redirect and only change to a 301 (permanent), if that is the intention, once you have confirmed that it works as intended. 301s are cached persistently by the browser so can make testing problematic.
Make sure the browser cache is cleared before testing.
Combining conditions using regex
Using regex, you could combine several of the conditions. For example, the following 4 conditions could be combined into one:
RewriteCond %{QUERY_STRING} =orderby=popularity [OR]
RewriteCond %{QUERY_STRING} =orderby=price-desc [OR]
RewriteCond %{QUERY_STRING} =orderby=date [OR]
RewriteCond %{QUERY_STRING} =orderby=menu_order [OR]
Is the same as (using regex alternation):
RewriteCond %{QUERY_STRING} ^orderby=(popularity|price-desc|date|menu_order)$ [OR]
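As a sanity check, the combined regex can be compared against the four exact matches it replaces. The sketch below is in Python and assumes its re module behaves like Apache's PCRE for this pattern; the helper names and the extra candidate strings are illustrative only.

```python
import re

# The four query strings matched exactly with the '=' prefix operator.
EXACT = {
    "orderby=popularity",
    "orderby=price-desc",
    "orderby=date",
    "orderby=menu_order",
}

# The single condition using regex alternation.
COMBINED = re.compile(r"^orderby=(popularity|price-desc|date|menu_order)$")

def strip_exact(query_string):
    """Decision made by the four exact-match conditions."""
    return query_string in EXACT

def strip_regex(query_string):
    """Decision made by the single combined condition."""
    return COMBINED.search(query_string) is not None

candidates = sorted(EXACT) + [
    "orderby=price",                   # a prefix of an alternative only
    "orderby=date&add_to_wishlist=1",  # extra parameter after the value
    "xorderby=date",                   # different parameter name
    "",                                # no query string at all
]

for qs in candidates:
    print(f"{qs!r}: exact={strip_exact(qs)} regex={strip_regex(qs)}")
```

Both approaches agree on every candidate, because the ^ and $ anchors stop the regex from matching a longer or shorter query string.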
UPDATE:
Is it possible to Remove everything (query string and parameters etc) from all URLs with something like * instead of hardcoding each query string?
To remove every query string from every URL (seriously?) then you can do the following (no, you don't use *):
RewriteCond %{QUERY_STRING} .
RewriteRule ^ %{REQUEST_URI} [QSD,R=302,L]
This removes any query string from any URL. The single dot (.) in the CondPattern matches a single character to check that there is a query string.
But this obviously removes the query strings you want to "keep" as well.
The regex character * is a quantifier that repeats the preceding token 0 or more times. (It is not a "wildcard-pattern".) It is not required here. You need to check that the query string is something, not nothing.
There are other options:
Reverse the logic and make exceptions for query strings you want to "keep" and remove the rest. But it depends which is the larger.
Don't match the query strings "exactly". And instead match URL parameter names, with any value.
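The two options can be combined: keep a request's query string only if it contains one of the parameter names you care about, whatever the value. The Python sketch below only illustrates that decision logic and is not an Apache configuration; KEEP_PARAMS is taken from the "keep" examples in the question, and should_strip is a hypothetical helper.

```python
from urllib.parse import parse_qsl

# Parameter names taken from the "keep" examples in the question.
KEEP_PARAMS = {"attachment_id", "p", "fbclid", "remove_item", "add-to-cart", "s"}

def should_strip(query_string):
    """True if the query string is non-empty and has no parameter worth keeping."""
    if not query_string:
        return False  # nothing to strip
    # keep_blank_values=True so a bare name such as "op" is still seen.
    names = {name for name, _ in parse_qsl(query_string, keep_blank_values=True)}
    return names.isdisjoint(KEEP_PARAMS)

print(should_strip("orderby=price-desc&add_to_wishlist=342"))
print(should_strip("op"))
print(should_strip("fbclid=PAAaaK8eCN"))
print(should_strip("add-to-cart=29030"))
```

The first two query strings would be stripped, and the last two would be left alone because fbclid and add-to-cart are on the keep list.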
I have hundreds of these old links I need to redirect.
Here is one example:
/index.php?option=com_content&view=article&id=433:seventh-character-code-categories-and-icd-10-cm&Itemid=101&showall=1
to
/seventh-character-code-categories-and-icd-10-cm
Essentially I need to remove the /index.php?option=com_content&view=article&id=433: part.
I tried this but I am getting confused with the [0-9] and : parts, so the following does not work:
RewriteRule ^/index.php?option=com_content&view=article&id=[0-9]:(.*)$ /$1 [L,R=301]
Say you want to capture from after : to right before & in the query string you mentioned, then try this expression:
^[^\:]*\:([^\&]*)\&.*$
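To see what this expression captures from the example link, it can be run against the query string alone. This is a sketch in Python, assuming its re module behaves like Apache's PCRE here; slug_from_query is a hypothetical helper.

```python
import re

# The expression from above, applied to the query string only.
QUERY_PATTERN = re.compile(r"^[^\:]*\:([^\&]*)\&.*$")

def slug_from_query(query_string):
    """Return the text between the first ':' and the next '&', or None."""
    m = QUERY_PATTERN.match(query_string)
    return m.group(1) if m else None

example = (
    "option=com_content&view=article"
    "&id=433:seventh-character-code-categories-and-icd-10-cm"
    "&Itemid=101&showall=1"
)
print(slug_from_query(example))

# The pattern requires an '&' after the slug, so this one is not matched.
print(slug_from_query("id=433:some-slug"))
```

The first call returns the slug from the example. The second returns None, which shows that links without a parameter after the id would need a slightly different expression.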
As @starkeen mentioned in the comments, you have to check against the query string. This can be done using RewriteCond %{QUERY_STRING}.
So if index.php is in the root folder:
RewriteEngine On
RewriteCond %{REQUEST_URI} ^\/index\.php$
RewriteCond %{QUERY_STRING} ^[^\:]*\:([^\&]*)\&.*$
RewriteRule ^(.*)$ http://example.com/%1? [R=301,L]
Here's another example. This one is for a sub folder:
RewriteEngine On
RewriteCond %{REQUEST_URI} ^\/pages\/index\.php$
RewriteCond %{QUERY_STRING} ^[^\:]*\:([^\&]*)\&.*$
RewriteRule ^(.*)$ /pages/%1? [R=301,L]
Also, notice the ? at the end of the URL in /pages/%1?; it prevents the original query string from being re-attached.
Another thing: groups captured in a RewriteCond are referenced as %N (for example %1), whereas groups captured by the RewriteRule pattern are referenced as $N.
BTW, depending on your server's configuration, you may need to add the NE flag, like [NE,L,R=301]. Also test whether it is necessary to double-escape the literal characters.
What about a more direct approach? Skip everything up to the colon, match the string up to the next &, and replace everything with the first match:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{QUERY_STRING} [^:]+:([\w-]+[^&]).*
RewriteRule .*$ \/%1? [R=301,L]
</IfModule>
I'm trying to append the subdomain as a query to eventual existing queries using htaccess.
http://test.domain.com should be http://test.domain.com?x=test
http://test.domain.com?id=1 should be http://test.domain.com?id=1&x=test
This is what I have done, but it doesn't work and I can't figure out why:
RewriteCond %{HTTP_HOST} ^([a-z0-9_-]+)\.domain\.com$ [NC]
# exclude www.domain.com
RewriteCond %1 !^(www)$ [NC]
RewriteRule ^[^\?]*(?:\?(.*))?$ index.php?$1&x=%1 [L]
My understanding was
[^\?]* all characters except ?, match 0 or more times
(?: start of a non capturing group
\? match ? literally
(.*) all characters after ? as a group
)? end of the non capturing group, match 0 or 1 times
But it does not work. Where is my mistake?
UPDATE 1:
I could make it work by using the following rule
RewriteRule (.*) index.php?$1&x=%1 [QSA,L]
http://test1.domain.com?y=test1 brings me [x=>test1,y=>test2]
but
http://test1.domain.com?y=test1&x=test3 brings me [x=>test3,y=>test2]
So it overrides my x value. Is there a way to block that?
UPDATE 2
This is the code I'm using now:
RewriteEngine On
RewriteCond %{HTTP_HOST} ^(?!www)([\w-]+)\. [NC]
RewriteCond %1::%{QUERY_STRING} !^(.+?)::x=\1(?:&|$) [NC]
RewriteRule ^ index.php?%{QUERY_STRING}&x=%1 [L]
Try this rule:
RewriteEngine On
RewriteCond %{HTTP_HOST} ^(?!www)([\w-]+)\. [NC]
RewriteCond %1::%{QUERY_STRING} !^(.+?)::x=\1(?:&|$) [NC]
RewriteRule ^ index.php?%{QUERY_STRING}&x=%1 [L]
Make sure this is the only rule you have in .htaccess while testing.
Explanation of:
RewriteCond %{HTTP_HOST} ^(?!www)([\w-]+)\. [NC]
RewriteCond %1::%{QUERY_STRING} !^(.+?)::x=\1(?:&|$) [NC]
We are capturing starting part of hostname from this group: ([\w-]+) which is denoted by %1. Note that we cannot use %1 in RHS of a condition.
We are then appending %1 and %{QUERY_STRING} together in %1::%{QUERY_STRING}. Here we could use any other arbitrary delimiter like ## as well.
In the RHS we have ^(.+?)::x=\1(?:&|$), which means %1 followed by the delimiter ::, followed by a literal x= and then \1, which is a back-reference to %1 (the group before ::). The ! before ^ negates the condition. In simple words, this condition means: execute this rule only if we don't already have x=subdomain in QUERY_STRING.
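The behaviour of the negated condition can be checked with a few test strings. This is only a sketch in Python, assuming its re module treats the back-reference like Apache's PCRE does; needs_append is a hypothetical helper that builds the same subdomain::query test string.

```python
import re

# Pattern from the second RewriteCond; re.IGNORECASE stands in for [NC].
ALREADY_PRESENT = re.compile(r"^(.+?)::x=\1(?:&|$)", re.IGNORECASE)

def needs_append(subdomain, query_string):
    """True when the negated condition holds, i.e. the rule would add x=subdomain."""
    test_string = f"{subdomain}::{query_string}"
    return ALREADY_PRESENT.search(test_string) is None

print(needs_append("test", ""))             # no query string yet
print(needs_append("test", "x=test"))       # x=subdomain already there
print(needs_append("test", "x=test&id=1"))  # already there, followed by more
print(needs_append("test", "x=testing"))    # different value sharing a prefix
print(needs_append("test", "id=1&x=test"))  # x= is not the first parameter
```

The last case shows that the pattern only recognises x=subdomain when it is the first parameter, because x= must directly follow the :: delimiter.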
Looks like you are trying to match the query string content with your RewriteRule’s pattern – that is not possible, it searches only the path component of the requested URL.
But, no worries – there’s an easy solution that helps combine the original query string, and what the pattern matched: The QSA flag.
So this should do the trick (combined with your existing RewriteConds):
RewriteRule (.*) index.php?$1 [QSA,L]
I have the following rewrite rule in order to control my different international domains to redirect to the main domain.
RewriteCond %{HTTP_HOST} !^www..*
RewriteCond %{HTTP_HOST} !^$
RewriteCond %{HTTP_HOST} ^([^.]*).(ru|co.in|in|de|com.br|co.uk|ca|com|com/)
RewriteRule ^.*$ http://www.[percent]1.[percent]2[percent]{REQUEST_URI} [R=301,L]
This has been working for the past few years.
Today, when I try to create a domain alias that contains one of the letters above, for example: tvonline.domain.com, it redirects to tvon.in. Basically happens with any alias that contains the letters in, ru, de, ca.
Is there something I can do about this?
Thanks!
There are several issues with the pattern matching, but the problem is likely in the line matching your international TLDs. Here is the issue on each line:
The . is a wildcard, so this condition excludes www.domain.com but also hosts such as wwwxxx.domain.com. The trailing .* (zero or more of any character) adds nothing.
The %{HTTP_HOST} should never be empty.
The . is a wildcard for any character, and you aren't anchoring the match to the end of %{HTTP_HOST} with $. Use a ? to make the first pattern ungreedy. You don't need to match on co.in because it will be matched by in.
I'm guessing that the [percent] in your example is really %, which is what it should be.
Try the following in place of what you have now:
RewriteCond %{HTTP_HOST} !^www\.
RewriteCond %{HTTP_HOST} ^(.*?)\.(ru|in|de|com\.br|co\.uk|ca|com|com)$
RewriteRule ^.*$ http://www.%1.%2%{REQUEST_URI} [R=301,L]
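The difference between the original and the corrected host pattern can be reproduced outside Apache. The sketch below is in Python, assuming its re module behaves like Apache's PCRE for these patterns; it models only the capturing condition (not the !^www check), and redirect_host is a hypothetical helper.

```python
import re

# Original third condition: unescaped dots and no end anchor.
OLD = re.compile(r"^([^.]*).(ru|co.in|in|de|com.br|co.uk|ca|com|com/)")
# Corrected condition: escaped dots, lazy first group, anchored with $.
NEW = re.compile(r"^(.*?)\.(ru|in|de|com\.br|co\.uk|ca|com|com)$")

def redirect_host(pattern, host):
    """Build the host the rule would redirect to, or None if the condition fails."""
    m = pattern.search(host)
    return f"www.{m.group(1)}.{m.group(2)}" if m else None

print(redirect_host(OLD, "tvonline.domain.com"))
print(redirect_host(NEW, "tvonline.domain.com"))
print(redirect_host(NEW, "domain.co.uk"))
```

With the old pattern the unescaped . matches the letter l in tvonline and in is then taken as the TLD, giving www.tvon.in. The corrected pattern yields www.tvonline.domain.com.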
Testing using http://htaccess.madewithlove.be/:
Rewrite:
Input URL: http://tvonline.domain.com/test.html
1. RewriteCond %{HTTP_HOST} !^www\.
This condition was met
2. RewriteCond %{HTTP_HOST} ^(.*?)\.(ru|in|de|com\.br|co\.uk|ca|com|com)$
This condition was met
3. RewriteRule ^.*$ http://www.%1.%2%{REQUEST_URI} [R=301,L]
This rule was met, the new url is http://www.tvonline.domain.com/test.html
The tests are stopped, using a different host will cause a redirect
Output URL: http://www.tvonline.domain.com/test.html
No Rewrite:
Input URL: http://www.tvonline.domain.com/test.html
1. RewriteCond %{HTTP_HOST} !^www\.
This condition was not met
2. RewriteCond %{HTTP_HOST} ^(.*?)\.(ru|in|de|com\.br|co\.uk|ca|com|com)$
This condition was met
3. RewriteRule ^.*$ http://www.%1.%2%{REQUEST_URI} [R=301,L]
This rule was not met because one of the conditions was not met
Thanks! This led me in the right direction to solve this issue. This is what I used to get it to work:
RewriteCond %{HTTP_HOST} !^www\.
RewriteCond %{HTTP_HOST} ^([^.]*?).(ru|in|de|com\.br|co\.uk|ca|com|com)$
RewriteRule ^.*$ http://www.%1.%2%{REQUEST_URI} [R=301,L]
The current RewriteRule removes the query string from any URL, unless the query contains callback.
# Remove question mark and parameters
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^?#\ ]*)\?[^\ ]*\ HTTP/ [NC]
# Query rewrite exceptions
##RewriteCond %{REQUEST_URI}?%{QUERY_STRING} !^/api.*?callback=.* #does not work
RewriteCond %{QUERY_STRING} !callback=
RewriteRule .*$ %{REQUEST_URI}? [R=301,L]
How can I keep the callback query only for URLs matching ^api\/?([^\/]*)$? Expected result:
no rewrite for /api?callback=1, /api/user?callback=1, /api/user/2?callback=1
rewrite for /apis?callback=1, /user?callback=1, /api/user?foo=1 etc.
I finally understood your question...
Replace these lines:
##RewriteCond %{REQUEST_URI}?%{QUERY_STRING} !^/api.*?callback=.* #does not work
RewriteCond %{QUERY_STRING} !callback=
with this line:
RewriteCond %{REQUEST_URI}?%{QUERY_STRING} !^/api(/.*)?\?callback=.*
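The expected results listed in the question can be checked against this pattern. This is a sketch in Python, assuming its re module behaves like Apache's PCRE here; query_is_stripped is a hypothetical helper that joins the path and query string the same way the condition's test string does.

```python
import re

# Pattern from the replacement RewriteCond (without the leading '!').
EXEMPT = re.compile(r"^/api(/.*)?\?callback=.*")

def query_is_stripped(request_uri, query_string):
    """True when the negated condition holds, i.e. the query string would be removed."""
    return EXEMPT.search(f"{request_uri}?{query_string}") is None

cases = [
    ("/api", "callback=1"),
    ("/api/user", "callback=1"),
    ("/api/user/2", "callback=1"),
    ("/apis", "callback=1"),
    ("/user", "callback=1"),
    ("/api/user", "foo=1"),
]

for uri, qs in cases:
    print(f"{uri}?{qs}: stripped={query_is_stripped(uri, qs)}")
```

The first three cases keep their query string and the last three lose it, matching the expected result in the question.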
Important notice:
If your script isn't located in the document root but in, e.g., the directory /htest,
and the full URL looks like mine: http://localhost/htest/api/?callback=1, then you have to put the full path to the API in your RewriteCond:
RewriteCond %{REQUEST_URI}?%{QUERY_STRING} !^/htest/api(/.*)?\?callback=.*
You can overcome that by beginning your regex with !/api instead of ^/path/to/api, but /foo/api and /bar/api will be skipped from rewriting too.
Update:
this .htaccess works fine in document root:
RewriteEngine On
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^?#\ ]*)\?[^\ ]*\ HTTP/ [NC]
RewriteCond %{REQUEST_URI}?%{QUERY_STRING} !^/api(/.*)?\?callback=.*
RewriteRule .*$ %{REQUEST_URI}? [R=temporary,L]
you may try using it without any other rules to check what is wrong
Update 2
If you have another rule, e.g.,
RewriteRule ^([^.]*)$ index.php?url=$1 [QSA,L]
repeat RewriteCond before it:
RewriteCond %{REQUEST_URI}?%{QUERY_STRING} !^/api(/.*)?\?callback=.*
RewriteRule ^([^.]*)$ index.php?url=$1 [QSA,L]
also to be able to use these rules in /foo subdir, replace ^/api with ^/foo/api
To keep the RewriteRule for index.php working, index.php also needs to be added to the query rewrite exceptions.
These rules work fine and fix the issue:
# Remove question mark and parameters
RewriteCond %{QUERY_STRING} .+
# Query rewrite exceptions
RewriteCond %{REQUEST_FILENAME} !index.php
RewriteCond %{REQUEST_URI}?%{QUERY_STRING} !/api(/.*)?\?callback=.*
RewriteRule .*$ %{REQUEST_URI}? [R=301,L]