Regular Expression for Mod rewrite via HTACCESS - regex

I need a regular expression which I can use in an HTACCESS file to rewrite:
http://www.sample.com/dir/1-2-3.php
1 = lower case letters only, no limit on how many
2 = alpha numeric (lower case letters only) and dashes, no limit on how many characters
3 = alpha numeric (upper case letters only), no limit on how many characters
(NOTE: The dashes between 1, 2, 3 are intentional, and will be present in the URL)
to
http://www.sample.com/dir/sub/page.php?v=ABC12345
Where ABC12345 is #3 from the original URL.

If I'm understanding correctly, the following should work.
RewriteEngine On
RewriteRule ^([a-z]*)-([a-z0-9-]*)-([A-Z0-9]*)\.php /$1/$2.php?v=$3 [L]
Hope this helps.

Try this rule in your .htaccess:
Options +FollowSymLinks
RewriteEngine on
RewriteRule ^dir/([a-z]*)-([a-z0-9-]*)-([A-Z0-9]*)\.php$ /dir/sub/page.php?v=$3 [R=301,L,NE,QSA]
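R=301 redirects with HTTP status 301 (permanent redirect).
L marks this as the last rule, so no further rules are processed for the request.
NE (noescape) prevents special characters in the substitution from being percent-encoded.
QSA appends the query string of the original request to the new query parameters.
$3 is the third capture group matched in the request path.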
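To check what the pattern captures before putting it on a server, here is a minimal Python sketch. It only mimics the regex part of the rule; the function name `rewrite` and the sample paths are made up for illustration, and Apache's flag handling is not modelled.

```python
import re

# Same pattern as the RewriteRule above. In .htaccess context the
# leading slash has already been stripped from the path.
PATTERN = re.compile(r"^dir/([a-z]*)-([a-z0-9-]*)-([A-Z0-9]*)\.php$")

def rewrite(path):
    """Return the redirect target for `path`, or None if the rule does not apply."""
    m = PATTERN.match(path)
    if m is None:
        return None
    # Only the third capture group is carried over to the new URL.
    return f"/dir/sub/page.php?v={m.group(3)}"

print(rewrite("dir/abc-my-page-ABC12345.php"))  # /dir/sub/page.php?v=ABC12345
print(rewrite("dir/abc-my-page-abc.php"))       # None (part 3 must be upper case or digits)
```

Because every group uses `*`, an empty part 3 (for example `dir/a-b-.php`) would also match; switching to `+` is one way to require at least one character.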

Related

How to match string if it doesn't contain only numbers after slash?

I am redirecting certain urls with path to get variables like the following:
localhost2/post/myTitle => localhost2/post.php?title=myTitle
localhost2/post/123 => localhost2/post.php?id=123
So In my htaccess file, I use
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteRule ^post/(\d+) post.php?id=$1
RewriteRule ^post/(.*) post.php?title=$1
</IfModule>
This works no problem. But I want to learn how to write the negative of ^post/(\d+), that is ^post/(NEGATE-ONLY-NUMBERS). In other words, I want a regex that matches the whole input string if there are not only numbers after post/. So post/abc, post/a23, post/ab3, post/12c and post/a2c should all pass, but not post/123. I referred to this post, which suggests using:
(?!^\d+$)^.+$
I can't use ^post/(?!^\d+$)^.+$, because there can be only one ^ and one $. I don't know what regex anchor specifies first position in a substring. My best guess is
post\/(?!\d++).*
I think (?!\d++), with the ++, would eat all following characters and check whether they are all digits. But this fails at post/1ab.
Another guess is:
post\/(?![\d,\/]+$).*
This works the best, but it allows: post/3455/X.
Secondly, eventually I need to convert localhost2/post/myTitle/123 => localhost2/post.php?title=myTitle&repeat=123 as well. I have come up with the following:
^post/(?!\d+($|/))(.+?($|/))(\d+$)?
Note: +? to use lazy quantifier, otherwise multiple slashes will be matched by .
and
^post/(?!\d+($|/))([^/\n\r]+($|/))(\d+$)?
Here I use [^/\n\r] instead of .+?
Patterns inside zero-width assertions like (?!\d++) are non-consuming, they do not "eat" chars, they only check the context while keeping the regex index at the same location as before matching the zero-width assertion pattern.
You can use any of the following:
^post/(?!\d+(?:/|$)).*
^post/(?!\d+(?=/|$)).*
^post/(?!\d+(?![^/])).*
See the regex demo. Details:
^post/ - start of input, post/ literal string
(?!\d+(?=/|$)) - a negative lookahead that fails the match if, immediately to the right of the current location, there are one or more digits followed by / or the end of the string
.* - the rest of the input.
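The three variants can be checked against the inputs from the question with a short Python sketch. Python's `re` module is used here only as a stand-in for the PCRE engine Apache uses; the lists `SHOULD_MATCH` and `SHOULD_FAIL` are assembled from the examples in the question.

```python
import re

PATTERNS = [
    r"^post/(?!\d+(?:/|$)).*",
    r"^post/(?!\d+(?=/|$)).*",
    r"^post/(?!\d+(?![^/])).*",
]

SHOULD_MATCH = ["post/abc", "post/a23", "post/ab3", "post/12c", "post/a2c", "post/1ab"]
SHOULD_FAIL = ["post/123", "post/3455/X"]

def accepted(pattern, path):
    """True if `pattern` matches `path` from the start."""
    return re.match(pattern, path) is not None

for pattern in PATTERNS:
    matched = [s for s in SHOULD_MATCH + SHOULD_FAIL if accepted(pattern, s)]
    # Every variant accepts exactly the same inputs.
    print(pattern, "->", matched)
```

All three patterns accept the six mixed inputs and reject both `post/123` and `post/3455/X`, because the lookahead only fails when the run of digits is followed by a slash or the end of the string.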
Do not overcomplicate things when you can keep them simple with 3 separate rewrite rules. Since your query parameters are named differently, you will need 3 separate rules anyway.
Consider:
Options -MultiViews
RewriteEngine On
RewriteRule ^post/(\d+)/?$ post.php?id=$1 [L,QSA,NC]
RewriteRule ^post/([^/]+)/(\d+)/?$ post.php?title=$1&repeat=$2 [L,QSA,NC]
RewriteRule ^post/([^/]*)/?$ post.php?title=$1 [L,QSA,NC]
Take note of Options -MultiViews. If MultiViews is not already disabled in the Apache config, you must disable it here; otherwise all $_GET parameters will be empty in your PHP file.
Option MultiViews (see http://httpd.apache.org/docs/2.4/content-negotiation.html) is used by Apache's content negotiation module, which runs before mod_rewrite and makes the Apache server match file extensions. So if /file is the URL, Apache will serve /file.html.
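The "first matching rule wins" behaviour of the three rules can be sketched in Python. This is an illustration only: the `route` function and the dictionary return values are made up, and each pattern is anchored with a trailing `/?$`, which is an assumption added here so that an input such as `post/12c` does not get picked up by the digits-only rule.

```python
import re

# Ordered like the RewriteRules: the first pattern that matches decides the result.
RULES = [
    (re.compile(r"^post/(\d+)/?$"),
     lambda m: {"id": m.group(1)}),
    (re.compile(r"^post/([^/]+)/(\d+)/?$"),
     lambda m: {"title": m.group(1), "repeat": m.group(2)}),
    (re.compile(r"^post/([^/]*)/?$"),
     lambda m: {"title": m.group(1)}),
]

def route(path):
    """Return the query parameters post.php would receive, or None if no rule matches."""
    for pattern, build in RULES:
        m = pattern.match(path)
        if m:
            return build(m)
    return None

print(route("post/123"))          # {'id': '123'}
print(route("post/myTitle"))      # {'title': 'myTitle'}
print(route("post/12c"))          # {'title': '12c'}
print(route("post/myTitle/123"))  # {'title': 'myTitle', 'repeat': '123'}
```

Keeping one pattern per parameter layout avoids the lookahead entirely: the order of the rules does the excluding.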

Rewrite URL condition - Change particular string & remove trailing numbers

What's the best way to rewrite URLs as 301 redirects with the following conditions?
Sample old URLs to rewrite:
/c/garments-apparel/red-yellow-polka-dress-10_450
/c/shoes-and-accessories/black-suede-boots-02_901
Conditions:
Change c to category
Remove trailing number (including connecting dash) from URL (example: -10_450 and -02_901)
New URLs should be:
/category/garments-apparel/red-yellow-polka-dress
/category/shoes-and-accessories/black-suede-boots
Note that changes will be applied to an .htaccess file on a Wordpress environment.
You can have this rule just below RewriteEngine On line:
RewriteEngine On
RewriteRule ^c/([\w-]+/.+)-[\d_]+/?$ /category/$1 [L,NC,R=301]
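A quick way to sanity-check the pattern against the sample URLs is a small Python sketch. It mirrors only the regex and the NC flag (as `re.IGNORECASE`); `redirect_target` is a hypothetical helper name, and the leading slash is omitted because .htaccess rules see the path without it.

```python
import re

# Same pattern as the rule above; NC is modelled with re.IGNORECASE.
RULE = re.compile(r"^c/([\w-]+/.+)-[\d_]+/?$", re.IGNORECASE)

def redirect_target(path):
    """Return the new URL path, or None if the rule does not match."""
    m = RULE.match(path)
    return f"/category/{m.group(1)}" if m else None

print(redirect_target("c/garments-apparel/red-yellow-polka-dress-10_450"))
# /category/garments-apparel/red-yellow-polka-dress
print(redirect_target("c/shoes-and-accessories/black-suede-boots-02_901"))
# /category/shoes-and-accessories/black-suede-boots
```

The greedy `.+` backtracks to the last dash that is followed only by digits and underscores, so dashes inside the product name are kept.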
You can use the regex
[-_]\d+
to replace the trailing numbers with "" (empty string),
then use the regex
\/c\/
and replace it with /category/.
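The two replacements can be tried out in Python as a rough sketch. The function name `clean` and the third URL are hypothetical, added only to show a side effect of the unanchored first pattern.

```python
import re

def clean(url):
    """Apply the two replacements in sequence."""
    url = re.sub(r"[-_]\d+", "", url)          # strip every -digits / _digits run
    return re.sub(r"/c/", "/category/", url)   # rename the leading segment

print(clean("/c/garments-apparel/red-yellow-polka-dress-10_450"))
# /category/garments-apparel/red-yellow-polka-dress
print(clean("/c/shoes-and-accessories/black-suede-boots-02_901"))
# /category/shoes-and-accessories/black-suede-boots
print(clean("/c/toys/top-10-games-07_123"))
# /category/toys/top-games
```

As the last call shows, `[-_]\d+` is not tied to the end of the URL, so a number in the middle of a slug is removed as well. If that matters, anchoring the pattern (for example `-[\d_]+$`) is one option.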

Unknown number of regex replacements, how?

I need to change a large number of URIs in the following way:
substitute %20 separators with dashes -,
substitute the old root with a new domain.
Examples:
/old_root/first/second.html -> http://new_domain.com/first/second
/old_root/first/second%20third.html -> http://new_domain.com/first/second-third
/old_root/first/second%20third%20fourth.html -> http://new_domain.com/first/second-third-fourth
The best I came up with using regex is to write as many pattern-replacement rules as the maximum number of %20 separators that can occur in my URIs:
old_root/(.*?)/(.*?)\.html -> http://new_domain.com/$1/$2
old_root/(.*?)/(.*?)%20(.*?)\.html -> http://new_domain.com/$1/$2-$3
old_root/(.*?)/(.*?)%20(.*?)%20(.*?)\.html -> http://new_domain.com/$1/$2-$3-$4
My question is: is it possible to obtain the same result using a single regular expression rule?
I thought I could use a two-step approach: first change all %20 separators to - and then use the rule old_root/(.*?)/(.*?)\.html -> http://new_domain.com/$1/$2/. However, I need to apply this rule in a .htaccess file as a RedirectMatch directive and, as far as I know, it is not possible to use two successive rules for the same redirect directive.
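Outside of Apache, the two-step idea looks like this in Python. It is only a sketch of the intended mapping: `convert` is a made-up name, and the output follows the examples above (no trailing slash).

```python
import re

OLD = re.compile(r"^/old_root/(.*?)/(.*?)\.html$")

def convert(uri):
    """Map an old URI to the new one, or return None if it has a different shape."""
    m = OLD.match(uri)
    if m is None:
        return None
    first = m.group(1)
    # Step 1: turn every %20 separator in the last field into a dash.
    second = m.group(2).replace("%20", "-")
    # Step 2: swap the old root for the new domain.
    return f"http://new_domain.com/{first}/{second}"

print(convert("/old_root/first/second%20third%20fourth.html"))
# http://new_domain.com/first/second-third-fourth
```

The number of separators does not matter here because the replacement is a plain "replace all", which is exactly what a single regex substitution with numbered groups cannot express.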
It turns out that Apache recursively applies all regex rules until they stop matching. Therefore, I am allowed to write two rules rather than one to solve my problem.
The following rules do what I was looking for, and more; I have tested them on my local Apache server and they work fine. Note that for them to work, you need to first turn on the rewrite engine by prepending
RewriteEngine on
Options +FollowSymlinks -MultiViews
in the local .htaccess file or in the global httpd.conf file.
Replace all spaces with hyphens
Replace both literal spaces and %20 with hyphens:
RewriteRule ^(.+)(\s|%20)(.+)$ /$1-$3 [R=301,NE,L]
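Each time this rule fires it replaces only one separator (the last one, since the first `.+` is greedy), and the rule is then hit again on the redirected URL. A minimal Python sketch of that loop, with a made-up `hyphenate` helper that also counts the passes:

```python
import re

SPACE_RULE = re.compile(r"^(.+)(\s|%20)(.+)$")

def hyphenate(path):
    """Apply the rule repeatedly until it no longer matches.

    Returns the final path and the number of passes (one redirect each).
    The leading slash of the substitution is left out, since .htaccess
    rules see the path without it on the next request.
    """
    passes = 0
    while True:
        m = SPACE_RULE.match(path)
        if m is None:
            return path, passes
        path = f"{m.group(1)}-{m.group(3)}"
        passes += 1

print(hyphenate("first/second%20third%20fourth"))
# ('first/second-third-fourth', 2)
```

So a URL with n separators costs n redirects with this approach, which is worth keeping in mind for long titles.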
Replace all apostrophes with hyphens
Replace all literal apostrophes and backticks, and %60 (the URL-encoded backtick), with hyphens:
RewriteRule ^(.+)('|`|%60)(.+)$ /$1-$3 [R=301,NE,L]
Delete the trailing .html extension
RewriteRule (.+)\.html$ $1 [R=301,L]
Convert the last field in the URL to lower case
Convert the last field in the URL to lower case and prepend the new domain:
RewriteRule /whatever/(.*?)/(.*?)/(.*) http://new.domain.com/$1/$2/${lc:$3} [R=301,L]
Important: The lowercase conversion will only work if you include the following lines at the end of the Apache configuration file httpd.conf, which is usually located in the etc directory on the server:
RewriteEngine on
RewriteMap lc int:tolower
A last note: I recommend prepending each rule with a RewriteCond directive to restrict the field of application of the rule. For example, to apply the space-to-hyphens rule only to those URIs that match a certain regex, you should write the following in your .htaccess file:
RewriteCond %{REQUEST_URI} regex_for_URIs
RewriteRule ^(.+)(\s|%20)(.+)$ /$1-$3 [R=301,NE,L]
where regex_for_URIs is the regular expression that the URI must match in order for the next RewriteRule to be applied; it can also be a simple string.
Well, you were almost done.
Problems
Don't capture "%20": use it as the delimiter between the parts of the path.
Make the third and fourth groups optional (because a URL may be shorter, as in your examples).
Solution
\/old_root\/(.*?)\/(\w*)(?:%20)?(\w*)?(?:%20)?(\w*)?\.html
See Demo
Explanation
(?:%20)? means "%20" is a non-capturing group that can occur 0 or 1 times.
The same logic applies to the optional 3rd and 4th parts.
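Here is a small Python sketch of how this pattern behaves. The `convert` helper and the way the captured parts are joined with dashes are assumptions for illustration; the slashes are written unescaped, which is equivalent in Python.

```python
import re

PATTERN = re.compile(r"/old_root/(.*?)/(\w*)(?:%20)?(\w*)?(?:%20)?(\w*)?\.html")

def convert(uri):
    """Join the captured parts with dashes, or return None if nothing matches."""
    m = PATTERN.match(uri)
    if m is None:
        return None
    directory, *parts = m.groups()
    # Optional groups that did not capture anything are skipped.
    slug = "-".join(p for p in parts if p)
    return f"http://new_domain.com/{directory}/{slug}"

print(convert("/old_root/first/second%20third.html"))
# http://new_domain.com/first/second-third
print(convert("/old_root/first/a%20b%20c%20d.html"))
# None
```

The second call shows the limit of this approach: the pattern handles at most two separators, so a URI with three or more needs extra optional groups.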

htaccess mod_rewrite regex - include numbers but exclude letters

I have always struggled with regex, and after reading up on it for 45 mins my head is spinning. Negative lookaheads, what the...? (?:/(?:(?!s\d+).)*)+$<--- OMG!!!
:(
So, I have a rule
RewriteRule /([0-9]+) /?id=$1 [R]
and it works fine when the url is www.hi.com/123
How can I make it refresh to / (the document root in this case) if the url is www.hi.com/123abc or www.hi.com/a123bc?
I just want to make sure only urls with numbers and nothing else are matched.
I tried
RewriteRule /([0-9]+)([^a-z]+) /map.htm?marker=$1 [R]
But that refreshes to www.hi.com/?id=404, oddly enough.
To match 1 or more numbers in regex it will be:
[0-9]+
To match 1 or more numbers or letters in regex it will be:
[0-9a-zA-Z]+
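One thing to watch in character classes like this: a range typed as A-z (lower case z) is easy to write by mistake and is wider than A-Z, because it also covers the ASCII characters between Z and a, namely [ \ ] ^ _ and the backtick. A short Python check, with made-up variable names:

```python
import re

loose = re.compile(r"[0-9a-zA-z]+")   # A-z: also covers [ \ ] ^ _ `
strict = re.compile(r"[0-9a-zA-Z]+")  # A-Z: letters only

print(bool(loose.fullmatch("a_b")))   # True
print(bool(strict.fullmatch("a_b")))  # False
```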
For your case RewriteRule rule will be:
RewriteRule ^([0-9a-z]+)/?$ /map.htm?marker=$1 [NC,L,R=302]
which will match /123abc OR /123abc/ OR /123 OR /abc/, note that trailing slash is optional. Flags I used are:
NC - Ignore Case
L - Last
R=302 - Use HTTP status 302 for the resulting redirect
I would strongly suggest reading the mod_rewrite reference doc: Apache mod_rewrite Introduction
However you also asked:
How can I make it refresh to / (the document root in this case) if the
url is www.hi.com/123abc or www.hi.com/a123bc?
That rule would be:
RewriteRule ^([0-9a-z]+)/?$ / [NC,L,R=302]
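If the goal is "digits only go to /?id=N, anything else goes back to /", the key is anchoring the digits pattern at both ends so that 123abc cannot match on its first three characters. A rough Python sketch of that decision, assuming a hypothetical `route` helper (this models the logic only, not Apache's rule processing):

```python
import re

def route(path):
    """Digits-only paths keep their id; everything else goes to the document root."""
    segment = path.strip("/")
    # fullmatch anchors the pattern at both ends, like ^[0-9]+$ in a RewriteRule.
    if re.fullmatch(r"[0-9]+", segment):
        return f"/?id={segment}"
    return "/"

print(route("/123"))     # /?id=123
print(route("/123abc"))  # /
print(route("/a123bc"))  # /
```

In rewrite terms that would mean one rule with `^([0-9]+)/?$` for the id case, followed by a catch-all for the remaining single-segment URLs.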

Help with Regex to match and rewrite URI

I need to have a RegEx that will match a URI like this based on the subdomain "blog"--
http://blog.foo.com/2010/06/25/city-tax-sale/
and redirect like this (getting rid of the subdomain and numbers/date)--
http://foo.com/city-tax-sale/
where the last bit "city-tax-sale" would be a wildcard. So basically any incoming URI that starts with 'blog.foo.com' would be redirected to 'foo.com' + whatever is at the end of the above URI after the three sub paths with numbers.
I hope that makes sense. Just trying to create one redirect instead of writing every single one.
This will explicitly match your date format, rather than any series of digits and slashes:
RewriteCond %{HTTP_HOST} ^blog\.foo\.com$ [NC]
RewriteRule ^/\d{4}/\d{2}/\d{2}/(.*)$ http://foo.com/$1 [L,R=301]
The regex part can be broken down as:
^ # start of non-domain url
/\d{4} # slash followed by 4 digits
/\d{2} # slash followed by 2 digits
/\d{2} # slash followed by 2 digits
/ # closing slash
(.*) # rest of the url, captured to group 1
$ # end of url
With the $1 in the replacement being group 1.
In the options part:
L is for "Last" - tells it to not bother looking at other rules.
R=301 is for Redirect with 301 header, which means permanent redirect (just R would send a temporary 302 header)
The RewriteCond bit performs a case-insensitive (NC option) check on the HTTP_HOST header (supplied by the user/client): if the host is blog.foo.com it performs the rewrite, otherwise it doesn't.
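The condition plus rule can be mimicked in Python to see what ends up in group 1. This is only a sketch: `redirect` is a made-up function, and the host is passed in as a plain string instead of being read from a request header.

```python
import re

HOST = re.compile(r"^blog\.foo\.com$", re.IGNORECASE)  # RewriteCond with NC
PATH = re.compile(r"^/\d{4}/\d{2}/\d{2}/(.*)$")         # RewriteRule pattern

def redirect(host, path):
    """Return the redirect URL, or None if the condition or the pattern fails."""
    if HOST.match(host) is None:
        return None
    m = PATH.match(path)
    return f"http://foo.com/{m.group(1)}" if m else None

print(redirect("blog.foo.com", "/2010/06/25/city-tax-sale/"))
# http://foo.com/city-tax-sale/
print(redirect("foo.com", "/2010/06/25/city-tax-sale/"))
# None
```

Paths that do not start with the date segments, such as /about/, are left alone because the pattern requires exactly 4, 2 and 2 digits.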
RewriteCond %{HTTP_HOST} ^blog\.foo\.com [NC]
RewriteRule ^(\d+/)+(.*)/?$ http://foo.com/$2 [L,R=301]
You can try this:
/http:\/\/blog\..*\.[a-zA-Z]{2,5}\/[0-9]{4}\/[0-9]{2}\/[0-9]{2}\/(.*)\//