REGEX: Extract parameter from URL

What REGEX should I use to extract the following parameter from a url string:
/?c=135&a=1341
I basically want to get the value of the a parameter from this string.
Thanks,

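If you want to extract the value of a, and the value consists of one or more digits, this regex should work:
preg_match('/(?:^|[?&])a=(\d+)/', $_SERVER['REQUEST_URI'], $matches)
Then use $matches[1] to read the value of a. The (?:^|[?&]) prefix keeps the pattern from matching a= at the end of a longer parameter name such as data=.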
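For comparison, here is a minimal sketch of the same extraction in Python, assuming the value is always digits. The helper names get_param_regex and get_param_parsed are made up for illustration; the second one skips the regex and lets the standard library parse the query string.

```python
import re
from urllib.parse import parse_qs, urlsplit


def get_param_regex(url, name="a"):
    """Return the digits after name= or None; the name must follow ?, & or the string start."""
    m = re.search(r"(?:^|[?&])" + re.escape(name) + r"=(\d+)", url)
    return m.group(1) if m else None


def get_param_parsed(url, name="a"):
    """Return the first value of the parameter using the standard URL parser."""
    values = parse_qs(urlsplit(url).query).get(name)
    return values[0] if values else None


print(get_param_regex("/?c=135&a=1341"))
print(get_param_parsed("/?c=135&a=1341"))
```

The anchored prefix matters: without it, a URL such as /?data=7 would be reported as a=7.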

I am going to answer a slightly more general question, suggested by the ? prefix in your example: removing a specific parameter from the request's query string (which does not include the leading ?). This answer uses the mod_rewrite engine so that you can implement it in your .htaccess file.
The rule is somewhat complex because you don't necessarily know where a=XXX appears among the query parameters, so you need different regexps for the case where a is the first parameter and the case where it is a subsequent one. You do this with a conditional pattern of the form (?(?=a=)regexp1|regexp2), so here it is:
RewriteEngine on
RewriteBase /
RewriteCond %{QUERY_STRING} ^(?(?=a=)a=[^&]*&?(.*)|(.*)&a=[^&]*(&.*)?)
RewriteRule ^.* $0?%1%2%3 [L]
If a is the first parameter, %1 contains the rest of the query string; otherwise %2 and %3 contain the parts before and after it (%3 may be empty).
If you want this to occur for specific scripts then replace the rule regexp ^.* by a more specific one.
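To check what result the rule is expected to produce, here is a small Python sketch that removes one parameter from a query string. It is not mod_rewrite and does not use the conditional pattern; the helper name strip_param is hypothetical, and it assumes pairs are separated by &.

```python
def strip_param(query, name="a"):
    """Return the query string with every name=value pair for `name` removed."""
    kept = [
        pair
        for pair in query.split("&")
        # Compare the whole parameter name so that data=1 survives when name is "a".
        if pair and pair.split("=", 1)[0] != name
    ]
    return "&".join(kept)


for q in ("c=135&a=1341", "a=1&c=2", "c=1&a=2&d=3"):
    print(q, "->", strip_param(q))
```

The three sample inputs cover the cases the rule distinguishes: a last, a first, and a in the middle.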
Enjoy :-)

Related

Redirect the URL from one query string to another

I have spent a great many hours trying to find a solution to this and tried many different approaches but nothing I have tried has worked so far.
I would like to redirect a URL with a query string to another URL that contains the value of that query string.
I want to redirect:
https://example.com/component/search/?searchword=XXXXXXXXX&searchwordsugg=&option=com_search
to
https://example.com/advanced-search?search=XXXXXXXXX
You can do something like the following using mod_rewrite at the top of your root .htaccess file:
RewriteEngine On
RewriteCond %{QUERY_STRING} (?:^|&)searchword=([^&]*)
RewriteRule ^component/search/?$ /advanced-search?search=%1 [NE,R=302,L]
The RewriteRule pattern matches against the URL-path only, which notably excludes the query string. To match against the query string we need a separate condition that checks against the QUERY_STRING server variable.
%1 is a backreference to the first capturing group in the preceding CondPattern, i.e. the value of the searchword URL parameter.
The regex (?:^|&)searchword=([^&]*) matches the searchword URL parameter anywhere in the query string, not just at the start (as in your example). This also permits an empty value for the URL parameter.
The NE flag is required to prevent the captured URL parameter value being doubly encoded in the response. (Since the QUERY_STRING server variable is not %-decoded.)
The L flag prevents further processing during this pass of the rewrite engine.
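The split between the two patterns can be sketched in Python: one regex sees only the URL-path, the other only the query string. This is an illustration of the matching logic under that assumption, not of Apache itself; redirect_target is a hypothetical helper.

```python
import re
from urllib.parse import urlsplit

COND = re.compile(r"(?:^|&)searchword=([^&]*)")  # tested against the query string
RULE = re.compile(r"^component/search/?$")       # tested against the URL-path


def redirect_target(url):
    """Return the redirect location, or None when the rule would not fire."""
    parts = urlsplit(url)
    # In .htaccess the rule pattern sees the path without its leading slash.
    path = parts.path.lstrip("/")
    cond = COND.search(parts.query)
    if RULE.search(path) and cond:
        return "/advanced-search?search=" + cond.group(1)
    return None


print(redirect_target(
    "https://example.com/component/search/"
    "?searchword=XXXXXXXXX&searchwordsugg=&option=com_search"
))
```

Because of the (?:^|&) prefix, a parameter named xsearchword does not trigger the redirect.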
Reference:
Apache docs: RewriteRule Directive
Apache docs: RewriteCond Directive

Unknown number of regex replacements, how?

I need to change a large number of URIs in the following way:
substitute %20 separators with dashes -,
substitute the old root with a new domain.
Examples:
/old_root/first/second.html -> http://new_domain.com/first/second
/old_root/first/second%20third.html -> http://new_domain.com/first/second-third
/old_root/first/second%20third%20fourth.html -> http://new_domain.com/first/second-third-fourth
The best I came up with using regex is to write as many pattern-replacement rules as the maximum number of %20 separators that can occur in my URIs:
old_root/(.*?)/(.*?)\.html -> http://new_domain.com/$1/$2
old_root/(.*?)/(.*?)%20(.*?)\.html -> http://new_domain.com/$1/$2-$3
old_root/(.*?)/(.*?)%20(.*?)%20(.*?)\.html -> http://new_domain.com/$1/$2-$3-$4
My question is: is it possible to obtain the same result using a single regular expression rule?
I thought I could use a two-step approach: first change all %20 separators to - and then use the rule old_root/(.*?)/(.*?)\.html -> http://new_domain.com/$1/$2/. However, I need to apply this rule in a .htaccess file as a RedirectMatch directive and, as far as I know, it is not possible to use two successive rules for the same redirect directive.
It turns out that Apache runs the rules again on every new request, and each external redirect (the R flag) produces a new request. The rules are therefore applied repeatedly until none of them matches, so I can write two rules rather than one to solve my problem.
The following rules do what I was looking for, and more; I have tested them on my local Apache server and they work fine. Note that for them to work, you need to first turn on the rewrite engine by prepending
RewriteEngine on
Options +FollowSymlinks -MultiViews
in the local .htaccess file or in the global httpd.conf file.
Replace all spaces with hyphens
Replace both literal spaces and %20 with hyphens:
RewriteRule ^(.+)(\s|%20)(.+)$ /$1-$3 [R=301,NE,L]
Replace all apostrophes and backticks with hyphens
Replace literal apostrophes, literal backticks, and %60 (the percent-encoded backtick) with hyphens:
RewriteRule ^(.+)('|`|%60)(.+)$ /$1-$3 [R=301,NE,L]
Delete the trailing .html extension
RewriteRule (.+)\.html$ $1 [R=301,L]
Convert the last field in the URL to lower case
Convert the last field in the URL to lower case and prepend the new domain:
RewriteRule /whatever/(.*?)/(.*?)/(.*) http://new.domain.com/$1/$2/${lc:$3} [R=301,L]
Important: The lowercase conversion will only work if you include the following lines at the end of the Apache configuration file httpd.conf, which is usually located in the etc directory on the server:
RewriteEngine on
RewriteMap lc int:tolower
A last note: I recommend preceding each rule with a RewriteCond directive to restrict the scope of the rule. For example, to apply the space-to-hyphen rule only to those URIs that match a certain regex, you should write the following in your .htaccess file:
RewriteCond %{REQUEST_URI} regex_for_URIs
RewriteRule ^(.+)(\s|%20)(.+)$ /$1-$3 [R=301,NE,L]
where regex_for_URIs is the regular expression that the URI must match in order for the next RewriteRule to be applied; it can also be a simple string.
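Why a single space rule is enough can be seen in a short Python sketch. Each pass stands in for one redirect: the greedy (.+) makes the rule replace only the last separator, and the loop repeats until the pattern stops matching. The function name hyphenate and the max_passes limit are assumptions for illustration; the leading slash of the real substitution is left out.

```python
import re

# Same pattern as the space-to-hyphen RewriteRule.
SPACE_RULE = re.compile(r"^(.+)(\s|%20)(.+)$")


def hyphenate(path, max_passes=20):
    """Apply the rule once per pass, as one redirect would, until it no longer matches."""
    for _ in range(max_passes):
        m = SPACE_RULE.match(path)
        if not m:
            break
        # Equivalent of the substitution $1-$3.
        path = m.group(1) + "-" + m.group(3)
    return path


print(hyphenate("first/second%20third%20fourth.html"))
```

A path with n separators therefore needs n passes, which in the real setup means n redirects before the final URL is reached.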
Well, you were almost done.
Problems
Don't capture "%20"; use it as the delimiter between the parts of the path.
Make the third and fourth groups optional (because a URL may be short, as in your examples).
Solution
\/old_root\/(.*?)\/(\w*)(?:%20)?(\w*)?(?:%20)?(\w*)?\.html
Explanation
(?:%20)? means "%20" is a non-capturing group that can occur 0 or 1 times.
The same logic applies to the optional 3rd and 4th parts.
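The pattern can be tried against the three example paths with a few lines of Python. This is only a sketch: new_url is a hypothetical helper, and joining the non-empty groups with hyphens is my assumption about how the captured parts would be reassembled.

```python
import re

PATTERN = re.compile(
    r"\/old_root\/(.*?)\/(\w*)(?:%20)?(\w*)?(?:%20)?(\w*)?\.html"
)


def new_url(old_path):
    """Build the new URL from the captured parts, skipping parts that are absent."""
    m = PATTERN.search(old_path)
    if not m:
        return None
    directory, *words = m.groups()
    slug = "-".join(w for w in words if w)
    return "http://new_domain.com/" + directory + "/" + slug


for p in (
    "/old_root/first/second.html",
    "/old_root/first/second%20third.html",
    "/old_root/first/second%20third%20fourth.html",
):
    print(p, "->", new_url(p))
```

Note the limit built into this approach: a file name with more than two %20 separators no longer matches, so the pattern has to be extended by hand for longer names.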

Regex for filename and querystring from url in 2 groups

I'm trying to write a mod_rewrite rule using a regular expression, and I'm a bit green as to some of the processes involved.
I believe I can do what I want if I can figure out how to get this regular expression right.
String is http://www.a.com/b.css?v=1234
I know I can get b.css?v=1234 with the regex
([^\/]+$)
What I'm looking for is it grouped so that %1 is b.css and %2 is 1234. Any help is appreciated. Thanks.
Based on the url you provided:
http://www.a.com/b.css?v=1234
You can use:
/(\w+\.\w{3})\?v=(\d+)
In Java, remember to escape the backslashes:
/(\\w+\\.\\w{3})\\?v=(\\d+)
Hope this helps.
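A quick way to confirm the two groups is to run the pattern in Python against the URL from the question. The second URL below is a made-up example showing one limitation: \w{3} requires an extension of exactly three characters.

```python
import re

PATTERN = re.compile(r"/(\w+\.\w{3})\?v=(\d+)")

m = PATTERN.search("http://www.a.com/b.css?v=1234")
filename, version = m.groups()
print(filename, version)

# A two-character extension such as .js is not matched by \w{3}.
print(PATTERN.search("http://www.a.com/app.js?v=1"))
```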
You need both a Condition and one or more Rules.
One of several ways to do it, tested on Apache 2.2 and 2.4:
RewriteCond %{QUERY_STRING} v=(\d+)
RewriteRule ^([^/]*) DoSomethingWithFile_$1_AndDigits_%1?
Input url: www.yoursite.com/b.css?v=1234
%1 contains 1234
$1 contains b.css
Rewritten url: www.yoursite.com/DoSomethingWithFile_b.css_AndDigits_1234

Htaccess regex to exclude everything except one string

I have done this: http://rubular.com/r/AHI15Tb4ju, and it matches the second URL (http://gamempire.localhost.it/news/tomb-pc), but I want to exclude that URL and match everything that does not contain "news/" (while still ending in the way that I have specified).
How can I do that?
Basically, I want to match only the third URL (http://gamempire.localhost.it/tomb-pc).
Thanks!
You can use a rule like this:
RewriteEngine On
RewriteCond %{REQUEST_URI} !/news/
RewriteRule -(?:pc|ps2|ps3|ps4|xbox-360|xbox-one|xbox|wii-u|wii|psp|ps-vita|ds|3ds|iphone|ipad|android|playstation)(.*)$ / [L,R]
Since I don't know what action you need, I just redirected the matching URIs to /, which you can change as needed.
Try using this:
^((?!news).)*-(?:pc|ps2|ps3|ps4|xbox-360|xbox-one|xbox|wii-u|wii|psp|ps-vita|ds|3ds|iphone|ipad|android|playstation)(.*)$
Note that I tried to modify your original pattern as little as possible, assuming you also need the (.*) at the end, even though it appears to be unnecessary for your purposes. As a result, the pattern also matches strings such as
"http://gamempire.localhost-pc.it/tomb" and "http://-pcgamempire.localhost.it/tomb".
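The lookahead version can be checked in Python against the URLs mentioned in this thread. The helper name matches is hypothetical, and the newsletter URL is an invented example showing that (?!news) rejects "news" anywhere before the hyphen, not only "news/".

```python
import re

PLATFORMS = (
    "pc|ps2|ps3|ps4|xbox-360|xbox-one|xbox|wii-u|wii|psp|ps-vita|"
    "ds|3ds|iphone|ipad|android|playstation"
)
PATTERN = re.compile(r"^((?!news).)*-(?:" + PLATFORMS + r")(.*)$")


def matches(url):
    """True when the URL has a -platform suffix with no 'news' before it."""
    return PATTERN.search(url) is not None


for url in (
    "http://gamempire.localhost.it/news/tomb-pc",
    "http://gamempire.localhost.it/tomb-pc",
    "http://gamempire.localhost-pc.it/tomb",
    "http://gamempire.localhost.it/newsletter-pc",
):
    print(url, matches(url))
```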

How to rewrite this URL to a redirect page?

I am using Microsoft-IIS/7.5 on a hosted server (Hostek.com)
I have an existing site with 2,820 indexed links in Google. You can see the results by searching Google with this: site:flyingpiston.com Most of the pages use a section, makerid, or bikeid to get the right information. Most of the links look like this:
flyingpiston.com/?BikeID=1068
flyingpiston.com/?MakerID=1441
flyingpiston.com/?Section=Maker&MakerID=1441
flyingpiston.com/?Section=Bike&BikeID=1234
On the new site, I am doing URL rewriting using .htaccess. The new URLs will look like this:
flyingpiston.com/bike/1068/
flyingpiston.com/maker/1123/
Basically, I just want to use my htaccess file to direct any request with a "?" question mark in it to a ColdFusion page called redirect.cfm. On this page, I will use ColdFusion to write a custom 301 redirect. Here's what ColdFusion's redirect looks like:
<cfheader statuscode="301" statustext="Moved Permanently">
<cfheader name="Location" value="http://www.newurl/bike/1233/">
<cfabort>
So, what does my htaccess file need to look like if I want to push everything with a question mark to a particular page? Here's what I have tried, but it's not working.
RewriteEngine on
RewriteRule ^? /redirect.cfm [NS,L]
Update. Using the advice from below, I am using this rule:
RewriteRule \? /redirect/redirect.cfm [NS,L]
To try to push this request
http://flyingpiston2012-com.securec37.ezhostingserver.com/?bikeid=1235
To this page:
http://flyingpiston2012-com.securec37.ezhostingserver.com/redirect/redirect.cfm
There are a couple of reasons why what you're trying isn't working.
The first one is that RewriteRule uses a regex, and ? is a regex metacharacter, which therefore needs to be escaped with a backslash (\?) to tell it to match the literal question mark character.
However, the second part of the problem is that the regex for RewriteRule is only tested against the path part of the URL; it specifically excludes the query string.
In order to match against the query string you need to use the RewriteCond directive, placed on the line before the rule (but applied in between the RewriteRule matching and replacing), acting as an additional filter. The useful bit is that you can specify which part of the URL to match against (as well as having the option for using non-regex tests).
Bearing all this in mind, the simplest way to match/rewrite a request with a query string is:
RewriteCond %{QUERY_STRING} .
RewriteRule .* /redirect/redirect.cfm
The %{QUERY_STRING} is what the regex is tested against (everything in CF's CGI scope can be used here, and some other stuff too - see the Server Variables box in the docs).
The single . just says "make sure the query string contains at least one character".
At the moment, this rule will preserve the existing query string - if you want to discard it, you can place a ? onto the end of the replacement URL. (If you need to use a query string on the URL and not discard the old version, use the [QSA] flag.)
In the opposite direction, you're losing the filename part of the URL - to preserve this, you probably want to append it onto the replacement as PATH_INFO, using the automatic whole-match capture $0.
Together, these two changes give:
RewriteCond %{QUERY_STRING} .
RewriteRule .* /redirect/redirect.cfm/$0?
One final thing is that you'll want to guard against infinite loops - the above rule strips the query string so it will always fail the RewriteCond, but better to be safe (especially if you might need to add a query string), which you can do with an extra RewriteCond:
RewriteCond %{QUERY_STRING} .
RewriteCond %{REQUEST_URI} !/redirect/redirect\.cfm
RewriteRule .* /redirect/redirect.cfm/$0?
Multiple RewriteCond are combined as ANDs, and the ! negates the match.
You can of course add whatever flags are required to the RewriteRule to have it behave as desired.
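The decision logic of the final three lines can be sketched in Python to see which requests are sent to the redirect page. This only models the two conditions and the substitution, under the assumption that the rules live in the site root's .htaccess; the function name rewrite and the host example.test are made up for illustration.

```python
import re
from urllib.parse import urlsplit

TARGET = "/redirect/redirect.cfm"


def rewrite(url):
    """Return the path (and query string) the request would end up with."""
    parts = urlsplit(url)
    has_query = re.search(r".", parts.query) is not None                   # first RewriteCond
    on_target = re.search(r"/redirect/redirect\.cfm", parts.path) is not None  # second, negated
    if has_query and not on_target:
        # $0 is the matched path; the trailing ? in the rule discards the old query string.
        return TARGET + "/" + parts.path.lstrip("/")
    # Rule not applied: the request is left unchanged.
    return parts.path + ("?" + parts.query if parts.query else "")


for url in (
    "http://example.test/?bikeid=1235",
    "http://example.test/bike/1068/",
    "http://example.test/redirect/redirect.cfm?x=1",
):
    print(url, "->", rewrite(url))
```

The third URL shows the loop guard at work: a request that is already on the redirect page keeps its query string and is not rewritten again.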