I would like to remove all query strings, including parameters and values, from URLs using .htaccess rules.
Here are a few example URLs; the query string needs to be removed from the end of each one.
https://example.com/other-category-slug/page/15/?orderby=price-desc&add_to_wishlist=342
https://example.com/page/62/?option=com_content&view=article&id=91&Itemid=2
https://example.com/page/30/?start=72
https://example.com/other-category-slug/page/12/?add_to_wishlist=9486
https://example.com/other-category-slug/page/15/?add_to_wishlist=9486
https://example.com/other-category-slug/page/4/?orderby=price-desc&add_to_wishlist=332
https://example.com/other-category-slug/page/15/?orderby=price-desc&add_to_wishlist=5736
https://example.com/other-category-slug/page/7/?orderby=popularity
https://example.com/other-category-slug/page/15/?add_to_wishlist=350
https://example.com/category-slug/page/19/?orderby=price-desc
https://example.com/category-slug/page/3/?orderby=date
https://example.com/page/2/?post_type=map
https://example.com/category-slug/page/2/?PageSpeed=noscript
https://example.com/category/page/6/?orderby=menu_order
https://example.com/page/50/?Itemid=wzshaxrogq
https://example.com/category-slug/page/1/?orderby=price&add_to_wishlist=12953
https://example.com/category-slug/this-is-product-slug/?PageSpeed=noscript
https://example.com/category-slug/?add_to_wishlist=15153
https://example.com/page/24/?op
https://example.com/page/68/?iact=hc&vpx=262&vpy=212&dur=2871&hovh=259&hovw=194&tx=104&ty=131&ei=KJ05TtKZOoi8rAfM2ZmPBQ&page=1&tbnh=129&tbnw=97&start=0&ndsp=35&ved=1t%3A429%2Cr%3A9%2Cs%3A0&doing_wp_cron=1466467271.7778379917144775390625
I need clean URLs like these, without query strings and parameters. category-slug and product-slug are just examples. I believe I need 5 rules.
https://example.com/category-slug/product-slug/
https://example.com/category-slug/page/15/
https://example.com/category-slug/
https://example.com/page/62/
https://example.com/
Here are a few query strings which I want to keep.
https://example.com/?attachment_id=123
https://example.com/?p=123
https://example.com/page/12/?fbclid=PAAaaK8eCN
https://example.com/your-shopping-cart/?remove_item=22c1acb3539e1aeba2
https://example.com/category-slug/this-is-product-slug/?add-to-cart=29030
https://example.com/?s=%7Bsearch_term_string%7D
Here is my code, which is not working. In fact, I don't understand the regex in it.
RewriteEngine On
RewriteRule ^(page/[0-9]+)/.+$ /$1? [L,NC,R=301]
RewriteCond %{QUERY_STRING} ^option=.+$ [NC,OR]
RewriteCond %{QUERY_STRING} ^[^=]+$
RewriteRule ^$ /? [L,NC,R=301]
Thanks in advance
Yes, the query strings are exact matches.
Whilst you've given examples of the URL-path, it looks like you just need to base the match on the query string part of the URL, not the URL-path? Unless the same query string could appear on another URL-path that you would want to keep?
You would only need to focus on the query strings you want to remove, not the ones you want to keep.
I believe i need 5 rules.
It looks like you would need just one rule, but with a lot of conditions (RewriteCond directives). One condition for every query string (since you say they are "exact matches").
RewriteCond %{QUERY_STRING} ^option=.+$ [NC,OR]
RewriteCond %{QUERY_STRING} ^[^=]+$
Although, rather confusingly, you are not attempting an "exact match" at all in your rule, but rather using a generic pattern. (Although you've stated you "don't understand the Regex".)
If you are wanting "exact matches" then you don't need to use regex at all. You can use the = prefix operator on the CondPattern (2nd argument to the RewriteCond directive) to make it an exact (lexicographical) match.
For example, try something like the following instead:
RewriteEngine On
RewriteCond %{QUERY_STRING} =orderby=price-desc&add_to_wishlist=342 [OR]
RewriteCond %{QUERY_STRING} =option=com_content&view=article&id=91&Itemid=2 [OR]
RewriteCond %{QUERY_STRING} =start=72 [OR]
RewriteCond %{QUERY_STRING} =add_to_wishlist=9486 [OR]
RewriteCond %{QUERY_STRING} =orderby=price-desc&add_to_wishlist=332 [OR]
RewriteCond %{QUERY_STRING} =orderby=price-desc&add_to_wishlist=5736 [OR]
RewriteCond %{QUERY_STRING} =orderby=popularity [OR]
RewriteCond %{QUERY_STRING} =add_to_wishlist=350 [OR]
RewriteCond %{QUERY_STRING} =orderby=price-desc [OR]
RewriteCond %{QUERY_STRING} =orderby=date [OR]
RewriteCond %{QUERY_STRING} =post_type=map [OR]
RewriteCond %{QUERY_STRING} =PageSpeed=noscript [OR]
RewriteCond %{QUERY_STRING} =orderby=menu_order [OR]
RewriteCond %{QUERY_STRING} =Itemid=wzshaxrogq [OR]
RewriteCond %{QUERY_STRING} =orderby=price&add_to_wishlist=12953 [OR]
RewriteCond %{QUERY_STRING} =add_to_wishlist=15153 [OR]
RewriteCond %{QUERY_STRING} =op [OR]
RewriteCond %{QUERY_STRING} =iact=hc&vpx=262&vpy=212&dur=2871&hovh=259&hovw=194&tx=104&ty=131&ei=KJ05TtKZOoi8rAfM2ZmPBQ&page=1&tbnh=129&tbnw=97&start=0&ndsp=35&ved=1t%3A429%2Cr%3A9%2Cs%3A0&doing_wp_cron=1466467271.7778379917144775390625
RewriteRule ^ %{REQUEST_URI} [QSD,R=302,L]
The above redirects to the same URL-path, but strips the original query string if it matches any of those stated in the preceding conditions.
The QSD flag (Query String Discard) strips the original query string from the request. This is the preferred method on Apache 2.4. However, if you are still on Apache 2.2 then you would need to append an empty query string instead (as you are doing in your existing rule). For example:
RewriteRule ^ %{REQUEST_URI}? [R,L]
Note there is no OR flag on the last RewriteCond directive.
NB: You had included the query string add_to_wishlist=9486 twice in the list of URLs/query strings to remove.
Test first with a 302 (temporary) redirect and only change to a 301 (permanent), if that is the intention, once you have confirmed that it works as intended. 301s are cached persistently by the browser so can make testing problematic.
Make sure the browser cache is cleared before testing.
Combining conditions using regex
Using regex, you could combine several of the conditions. For example, the following 4 conditions could be combined into one:
RewriteCond %{QUERY_STRING} =orderby=popularity [OR]
RewriteCond %{QUERY_STRING} =orderby=price-desc [OR]
RewriteCond %{QUERY_STRING} =orderby=date [OR]
RewriteCond %{QUERY_STRING} =orderby=menu_order [OR]
Is the same as (using regex alternation):
RewriteCond %{QUERY_STRING} ^orderby=(popularity|price-desc|date|menu_order)$ [OR]
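The same idea could be taken a step further for the wishlist query strings, although this is then no longer an "exact match". The following is only a sketch, assuming the add_to_wishlist value is always a numeric ID (as it is in every example in the question) and is optionally preceded by orderby=price or orderby=price-desc:

```apache
# Assumption: add_to_wishlist is always numeric. This matches, for example,
#   add_to_wishlist=9486
#   orderby=price-desc&add_to_wishlist=342
#   orderby=price&add_to_wishlist=12953
# Keep the OR flag only if further conditions follow this one.
RewriteCond %{QUERY_STRING} ^(orderby=price(-desc)?&)?add_to_wishlist=\d+$ [OR]
```

Bear in mind that this matches any numeric ID, not just the ones listed, so it is only suitable if none of those query strings need to be kept.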
UPDATE:
Is it possible to Remove everything (query string and parameters etc) from all URLs with something like * instead of hardcoding each query string?
To remove every query string from every URL (seriously?) then you can do the following (no, you don't use *):
RewriteCond %{QUERY_STRING} .
RewriteRule ^ %{REQUEST_URI} [QSD,R=302,L]
This removes any query string from any URL. The single dot (.) in the CondPattern matches a single character to check that there is a query string.
But this obviously removes the query strings you want to "keep" as well.
The regex character * is a quantifier that repeats the preceding token 0 or more times. (It is not a "wildcard-pattern".) It is not required here. You need to check that the query string is something, not nothing.
There are other options:
Reverse the logic and make exceptions for query strings you want to "keep" and remove the rest. But it depends which is the larger.
Don't match the query strings "exactly". And instead match URL parameter names, with any value.
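These two options might look something like the following. Both are sketches rather than tested rules. The parameter names are taken from the lists in the question, and you would use one option or the other, not both:

```apache
RewriteEngine On

# Option 1 (sketch): reverse the logic. Strip any query string unless it
# contains one of the parameters the question wants to keep.
RewriteCond %{QUERY_STRING} .
RewriteCond %{QUERY_STRING} !(^|&)(attachment_id|p|fbclid|remove_item|add-to-cart|s)=
RewriteRule ^ %{REQUEST_URI} [QSD,R=302,L]

# Option 2 (sketch): match unwanted parameter names with any value.
# The query string is stripped if it contains any one of these names.
RewriteCond %{QUERY_STRING} (^|&)(orderby|add_to_wishlist|PageSpeed|option|Itemid|start|post_type)=
RewriteRule ^ %{REQUEST_URI} [QSD,R=302,L]
```

Note that Option 1 applies to every URL handled by this .htaccess file, so any other parameter the site relies on would need adding to the list. Option 2 as written does not cover the bare op query string or the long iact=... example, which would still need their own conditions.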
Related
I have the following regex, but it only captures one variable, so user ends up containing user/url. How would I modify this to capture the url variable separately in $2?
RewriteCond %{HTTP_HOST} ^(^.*)\.example.com$ [NC]
RewriteRule ^(.+/?[^/]*)$ http://example.com/index.php?sub=%1&url=$1 [P,NC,QSA,L]
I need this to translate
http://sub.example.com/user/url
to
http://example.com/index.php?sub=%1&user=$1&url=$2
Your regex to capture 2 values from RewriteCond and RewriteRule doesn't seem correct.
You may use:
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com$ [NC]
RewriteRule ^([^/]+)(?:/([^/]+))?/?$ http://example.com/index.php?sub=%1&user=$1&url=$2 [P,NC,QSA,L]
I assume you have mod_proxy set up, since you're using the P flag.
I have hundreds of these old links I need to redirect.
Here is one example:
/index.php?option=com_content&view=article&id=433:seventh-character-code-categories-and-icd-10-cm&Itemid=101&showall=1
to
/seventh-character-code-categories-and-icd-10-cm
Essentially I need to remove the /index.php?option=com_content&view=article&id=433: part.
I tried this but I am getting confused with the [0-9] and : parts, so the following does not work:
RewriteRule ^/index.php?option=com_content&view=article&id=[0-9]:(.*)$ /$1 [L,R=301]
Say you want to capture from after : to right before & in the query string you mentioned, then try this expression:
^[^\:]*\:([^\&]*)\&.*$
As @starkeen mentioned in the comments, you have to check against the query string. This can be done using RewriteCond %{QUERY_STRING}
So if index.php is in the root folder:
RewriteEngine On
RewriteCond %{REQUEST_URI} ^\/index\.php$
RewriteCond %{QUERY_STRING} ^[^\:]*\:([^\&]*)\&.*$
RewriteRule ^(.*)$ http://example.com/%1? [R=301,L]
Here's another example. This one is for a sub folder:
RewriteEngine On
RewriteCond %{REQUEST_URI} ^\/pages\/index\.php$
RewriteCond %{QUERY_STRING} ^[^\:]*\:([^\&]*)\&.*$
RewriteRule ^(.*)$ /pages/%1? [R=301,L]
Also, notice the ? at the end of the URL /pages/%1?: this prevents the original query string from being re-attached.
Another thing: groups captured in a RewriteCond are referenced as %N (%1, %2, ...), whereas groups captured by the RewriteRule pattern are referenced as $N.
BTW, depending on your server's configuration, you may need to add the NE flag, like [NE,L,R=301]. Also test whether it is necessary to double-escape the literal characters.
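Putting those points together, a tighter variant might look like this. It is only a sketch, assuming the old links always have the form id=<number>:<slug>, that the .htaccess file is in the document root, and that the server is Apache 2.4 (for the QSD flag):

```apache
RewriteEngine On
# Capture the slug that follows "id=<digits>:" up to the next "&".
# (?:...) is a non-capturing group, so the slug is available as %1.
RewriteCond %{QUERY_STRING} (?:^|&)id=\d+:([^&]+)
# QSD discards the original query string; NE stops any %-encoded
# characters in the slug from being encoded a second time.
RewriteRule ^index\.php$ /%1 [QSD,NE,R=301,L]
```

On Apache 2.2, drop QSD and end the substitution with ? instead (that is, /%1?).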
What about a more direct approach? Skip everything up to the colon, match the string up to the &, and replace everything with the first match.
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{QUERY_STRING} [^:]+:([\w-]+[^&]).*
RewriteRule .*$ \/%1? [R=301,L]
</IfModule>
I've got a problem with some website URLs which I want htaccess to redirect after removing a few query string parameters, for example:
http://www.mywebsite.com/archive?s=200&dis=default&opt=foo
http://www.mywebsite.com/archive?dis=foo&opt=baz
or
http://www.mywebsite.com/archive?type=default&format=rss
http://www.mywebsite.com/archive?pg=3&format=rss&type=default
I want to keep all the parameters except for type, format, dis or opt, which are causing a 404 error. I've found a way to remove a single parameter, but I still can't find a regex or anything else that removes multiple query parameters.
This is my code so far:
RewriteCond %{REQUEST_URI} ^.*/archive
RewriteCond %{QUERY_STRING} ^(.*)&?view=[^&]+&?(.*)$ [OR]
RewriteCond %{QUERY_STRING} ^(.*)&?opt=[^&]+&?(.*)$ [OR]
RewriteCond %{QUERY_STRING} ^(.*)&?type=[^&]+&?(.*)$ [OR]
RewriteCond %{QUERY_STRING} ^(.*)&?format=[^&]+&?(.*)$
RewriteRule ^/?(.*)$ /$1?%1%2 [R=301,L]
This doesn't work because it removes just a single parameter and keeps the others that are causing errors.
P.S. As you can see, it should work only on 'archive' page, but that's not a problem :)
UPDATE
This is an URL that I'm testing at the moment:
http://www.mywebsite.com/archive?foo=0&force=0&format=feed&type=rss
Which I want to be like this:
http://www.mywebsite.com/archive?foo=0&force=0
RE-UPDATED
By using collapsar's answer, the server's error_log shows this:
Invalid command '<If', perhaps misspelled or defined by a module not included in the server configuration
Discussion
Unfortunately, the query string portion of URLs is excluded from rewriting by default. The RewriteRule directive does not match against the query string portion. A modified query string has to be written explicitly into the substitution string (an unmodified one is passed through automatically).
This implies that the rewriting cannot be accomplished without resorting to RewriteCond directives (FWIW, that is why the previous versions of this answer were wrong).
RewriteRule performs the actual rewriting as soon as any one of the RewriteCond patterns linked with the OR flag matches. This implies that the set of conditions will not be exhaustively tested.
Solution
Adjust your rule set as follows:
RewriteCond %{REQUEST_URI} ^.*/archive
RewriteCond %{QUERY_STRING} ^(.*?)([&?])format=[^&]+&?(.*)$
RewriteRule ^(.*)$ $1?%1%2%3
RewriteCond %{REQUEST_URI} ^.*/archive
RewriteCond %{QUERY_STRING} ^(.*?)([&?])opt=[^&]+&?(.*)$
RewriteRule ^(.*)$ $1?%1%2%3
RewriteCond %{REQUEST_URI} ^.*/archive
RewriteCond %{QUERY_STRING} ^(.*?)([&?])type=[^&]+&?(.*)$
RewriteRule ^(.*)$ $1?%1%2%3
RewriteCond %{REQUEST_URI} ^.*/archive
RewriteCond %{QUERY_STRING} ^(.*?)([&?])view=[^&]+&?(.*)$
RewriteRule ^(.*)$ $1?%1%2%3
RewriteCond %{REQUEST_URI} ^.*/archive
RewriteCond %{QUERY_STRING} ^(.*[?])breaktheloop=1(.*)$
RewriteRule .? - [S=1]
RewriteRule ^(.*)$ $1?breaktheloop=1 [QSA,R=301,L]
RewriteRule ^(.*)$ $1?%1%2 [L]
The RewriteCond patterns take into account that improperly escaped URLs may be the value of some parameter in the query string. Drop the non-greedy matching modifier (i.e. use ^(.*) instead of ^(.*?)) if you are not concerned about this.
Synopsis
The differences to the OP's original solution are:
individual substitution of each offending parameter
including the parameter separator in the substitution pattern
catering for improperly escaped urls as a parameter value
individual rewriting (copy) rule to trigger redirection after any number of substitutions have applied. QSA flag is necessary to keep the sanitized query string.
breaktheloop parameter to, well, break the redirection loop.
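For comparison, a different and more compact approach (not the rule set above) is to redirect once per unwanted parameter until none are left. This is only a sketch, assuming the page lives at /archive in the document root and that the parameters to drop are type, format, dis and opt as listed in the question:

```apache
RewriteEngine On
# Sketch: each pass removes one unwanted parameter and redirects.
# %1 is everything before the parameter (including its trailing "&"),
# %2 is everything after it. Once no unwanted parameter remains, the
# condition no longer matches, so no loop marker is needed.
RewriteCond %{REQUEST_URI} ^/archive
RewriteCond %{QUERY_STRING} ^(.*&)?(?:type|format|dis|opt)=[^&]*(?:&(.*))?$
# NE stops %-encoded values in the kept parameters from being re-encoded.
RewriteRule ^ %{REQUEST_URI}?%1%2 [NE,R=302,L]
```

The trade-off is that a URL with several unwanted parameters goes through several redirects. Test with a 302 and only switch to a 301 once the result is confirmed.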
Documentation
The respective section of the Apache httpd directive docs is the place to find more detailed information.
I have two url's I'm trying to rewrite, for the past... 4-5 hours (headache now).
I am trying to rewrite
/arts/tag/?tag=keyword
to
/search/art?keywords=keyword
Looking at other questions I formulated my rewrite like this
RewriteRule /arts/tag/?tag=([^&]+) search/art?keywords=$1 [L,R=301,NC]
and
RewriteRule ^arts/tag/?tag=$ /search/art\?keywords=%1? [L,R=301,NC]
I tried with backslashes and without, no luck.
Also tried
RewriteCond %{QUERY_STRING} /arts/tag/?tag=([^&]+) [NC]
RewriteRule .* /search/art\?keywords=%1? [L,R=301,NC]
The second one is similar,
/arts/category?id=1&sortby=views&featured=1
to
/art/moved?id=1&rearrange=view
The reason I change the get variable name is for my own learning purpose as I haven't found any tutorials for my purpose. I also changed category to moved since the categories have changed and I have to internally redirect some ID #'s.
RewriteCond %{QUERY_STRING} id=([^&]+) [NC] // I need the path in there though, not just query string, since I'll be redirecting /blogs/category and /art/category to different places.
RewriteRule .* /art/moved/id=%1? [L,R=301,NC]
Any help will be appreciated. Thank you.
Assuming the queries in the original URLs have nothing in common with those in the substitution URLs, maybe this will do what you want, using the first key in the query as a condition and to identify the incoming URL:
RewriteEngine On
RewriteBase /
# First case
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{QUERY_STRING} \btag\b
RewriteRule .* http://example.com/search/art?keywords=keyword [L]
Will map this:
http://example.com/arts/tag/?tag=keyword
To this:
http://example.com/search/art?keywords=keyword
# Second case
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{QUERY_STRING} \bid\b
RewriteRule .* http://example.com/art/moved?id=1&rearrange=view [L]
Will map this:
http://example.com/arts/category?id=1&sortby=views&featured=1
To this:
http://example.com/art/moved?id=1&rearrange=view
Both are mapped silently. If the new URL is to be shown in the browser's address bar modify the flags like this [R,L]. Replace R with R=301 for a permanent redirect.
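If the actual value of tag (or id) should be carried over rather than hard-coded, and each rule should only fire for its own URL-path, something along these lines might work. It is a sketch, assuming the .htaccess file is in the document root, and for the second case that only id is carried over while rearrange=view is a fixed value:

```apache
RewriteEngine On

# First case: /arts/tag/?tag=keyword -> /search/art?keywords=keyword
# The tag value is captured from the query string as %1.
RewriteCond %{QUERY_STRING} (?:^|&)tag=([^&]+)
RewriteRule ^arts/tag/?$ /search/art?keywords=%1 [NE,R=302,L]

# Second case: /arts/category?id=1&sortby=views&featured=1
#           -> /art/moved?id=1&rearrange=view
# Assumption: rearrange=view is fixed; sortby and featured are dropped.
RewriteCond %{QUERY_STRING} (?:^|&)id=([^&]+)
RewriteRule ^arts/category/?$ /art/moved?id=%1&rearrange=view [NE,R=302,L]
```

Because each substitution contains its own query string, the original query string is replaced rather than appended. Matching the URL-path in the RewriteRule pattern is what allows /blogs/category and /arts/category to be redirected to different places.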
The current RewriteRule removes any query string, except one containing callback, from any URL.
# Remove question mark and parameters
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^?#\ ]*)\?[^\ ]*\ HTTP/ [NC]
# Query rewrite exceptions
##RewriteCond %{REQUEST_URI}?%{QUERY_STRING} !^/api.*?callback=.* #does not work
RewriteCond %{QUERY_STRING} !callback=
RewriteRule .*$ %{REQUEST_URI}? [R=301,L]
How can I exempt the callback query from the rewrite only for URLs matching ^api\/?([^\/]*)$? Expected result:
no rewrite for /api?callback=1, /api/user?callback=1, /api/user/2?callback=1
rewrite for /apis?callback=1, /user?callback=1, /api/user?foo=1 etc.
I finally understood your question...
Replace these lines:
##RewriteCond %{REQUEST_URI}?%{QUERY_STRING} !^/api.*?callback=.* #does not work
RewriteCond %{QUERY_STRING} !callback=
with this line:
RewriteCond %{REQUEST_URI}?%{QUERY_STRING} !^/api(/.*)?\?callback=.*
Important notice:
If your script isn't located in the document root but in, e.g., the directory /htest,
and the full URL looks like mine: http://localhost/htest/api/?callback=1, then you have to put the full path to the API in your RewriteCond:
RewriteCond %{REQUEST_URI}?%{QUERY_STRING} !^/htest/api(/.*)?\?callback=.*
You can overcome that by beginning your regex with !/api instead of ^/path/to/api, but /foo/api and /bar/api will be skipped from rewriting too.
Update:
this .htaccess works fine in document root:
RewriteEngine On
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^?#\ ]*)\?[^\ ]*\ HTTP/ [NC]
RewriteCond %{REQUEST_URI}?%{QUERY_STRING} !^/api(/.*)?\?callback=.*
RewriteRule .*$ %{REQUEST_URI}? [R=temporary,L]
you may try using it without any other rules to check what is wrong
Update 2
If you have another rule, e.g.,
RewriteRule ^([^.]*)$ index.php?url=$1 [QSA,L]
repeat RewriteCond before it:
RewriteCond %{REQUEST_URI}?%{QUERY_STRING} !^/api(/.*)?\?callback=.*
RewriteRule ^([^.]*)$ index.php?url=$1 [QSA,L]
also to be able to use these rules in /foo subdir, replace ^/api with ^/foo/api
For the index.php RewriteRule to keep working, index.php has to be added to the query rewrite exceptions.
These rules work fine and fix the issue:
# Remove question mark and parameters
RewriteCond %{QUERY_STRING} .+
# Query rewrite exceptions
RewriteCond %{REQUEST_FILENAME} !index.php
RewriteCond %{REQUEST_URI}?%{QUERY_STRING} !/api(/.*)?\?callback=.*
RewriteRule .*$ %{REQUEST_URI}? [R=301,L]