How to treat a regex expression as a block to apply lookbehinds and lookaheads? - regex

I'm trying to turn a string of this type:
http://example.com/mypage/272-16+276-63+350-02
where aaa-bb are product codes and their numbers may vary from 2 to anything, but I doubt there will ever be more than 5 into:
http://example.com/mypage/272-16+276-63+350-02/?skus=272-16+276-63+350-02
using a redirect match. I'm fairly new to regular expressions and I don't seem to be able to get the negative lookahead and lookbehind to work the way I want.
Capturing the string the first time is fairly easy: I used ([\-\+0-9]+). But I don't want it to match on redirection (when I already have a ? in my link). Using ([\-\+0-9]+)(?!\?)(?<\?) doesn't do the trick; it only excludes my last digit from the match. So, is there a way to make regex consider all my product codes as one block, so I can then check if there is a question mark before or after it?
Thank you for looking into this.

You can't mix mod_rewrite (RewriteCond) and mod_alias (RedirectMatch) together. You need to stick with one or the other, and since you can't match the query string with a RedirectMatch, you need to use mod_rewrite:
RewriteEngine On
RewriteCond %{QUERY_STRING} !skus=
RewriteRule ^mypage/([\-\+0-9]+)$ /mypage/$1?skus=$1 [L,R=301]
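To see what the condition and the rule do together, here is a rough Python sketch of the same decision. It is only an illustration of the logic, not how Apache evaluates rules internally; the function name rewrite and the way the path and query string are passed in are assumptions made for this example.

```python
import re

# Same pattern as the RewriteRule above (the path has no leading slash,
# as in a per-directory .htaccess context).
RULE = re.compile(r"^mypage/([\-\+0-9]+)$")


def rewrite(path, query_string=""):
    """Return the redirect target, or None when the rule does not apply.

    Hypothetical helper: mimics the RewriteCond + RewriteRule pair.
    """
    # RewriteCond %{QUERY_STRING} !skus=  -> skip requests that already have skus=
    if re.search(r"skus=", query_string):
        return None
    m = RULE.match(path)
    if m is None:
        return None
    # /mypage/$1?skus=$1
    return "/mypage/{0}?skus={0}".format(m.group(1))


print(rewrite("mypage/272-16+276-63+350-02"))
print(rewrite("mypage/272-16+276-63+350-02", "skus=272-16+276-63+350-02"))
```

The first call produces a redirect target; the second returns None because the query string already contains skus=, which is what stops the redirect loop.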

Maybe try (?<=http://example.com/mypage/)[0-9+-]+$ which will match only the first case.
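A quick way to check that claim is to run the pattern against both URLs. The sketch below uses Python's re module, which accepts this fixed-width lookbehind; the helper name codes is made up for the example, and the dots in the host name are escaped here, which the pattern above does not do.

```python
import re

# Lookbehind fixes where the match must start; $ forces the codes to be
# the very last thing in the URL.
pattern = re.compile(r"(?<=http://example\.com/mypage/)[0-9+-]+$")

first = "http://example.com/mypage/272-16+276-63+350-02"
redirected = first + "/?skus=272-16+276-63+350-02"


def codes(url):
    """Return the product-code block, or None if the URL is not the first form."""
    m = pattern.search(url)
    return m.group(0) if m else None


print(codes(first))
print(codes(redirected))
```

The redirected URL is rejected because the code block directly after /mypage/ is followed by /?skus=..., so $ cannot match there, and the codes after skus= are not preceded by the required prefix.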

Related

htaccess - redirect string containing part of specific string

UPDATED - my initial question wasn't quite correct. (apologies to all concerned)
UPDATED again - (this is not my day today..)
I need to redirect all incoming image requests for:
http://www.example.com/images/asd12catalog.jpg (there is an additional alpha character)
To:
http://www.example.com/images/as-d12.jpg (I have added the "-")
So I need to strip out the word catalog and change the first portion of the filename to add a "-" making as-d12.jpg.
I have tried variations on:
RewriteRule ^/images/[a-z0-9]catalog.jpg$ /images/$1.jpg
But I just can't seem to get a match.
Can anyone help please?
Thanks.
Your attempt was very close, the only major problem being that you did not actually wrap anything in your regex as a capture group. By placing parentheses around the parts of the filename you want to keep, as in the rule below, each part becomes available as $1, $2 and $3 after the match has finished. You can then use these as you expected in your redirect URL.
RewriteRule ^/images/([a-zA-Z]{2})([a-zA-Z]{1})([0-9]*)catalog\.jpg$ /images/$1-$2$3.jpg
Demo:
Regex101
RewriteRule ^/?images/([a-zA-Z]{2})([a-zA-Z]{1})([0-9]+)catalog\.jpg$ /images/$1-$2$3.jpg
You're not specific about the exact format of your filenames, but this will match anything followed by catalog.jpg, which will hopefully cover any requirements.
Also note that the leading / should at most be optional when matching in rewrite rules - they haven't been part of the path parsed by RewriteRule since version 1. See https://webmasters.stackexchange.com/questions/27118/when-is-the-leading-slash-needed-in-mod-rewrite-patterns
Edit: updated again for new requirement
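The capture-group idea can be tried outside Apache. This is a small Python sketch of the same substitution, with the dot before jpg escaped; rewrite_image is a hypothetical helper name, and Python writes backreferences as \1 where mod_rewrite uses $1.

```python
import re

# Two letters, one letter, then digits, followed by the literal "catalog.jpg".
pattern = re.compile(r"^/?images/([a-zA-Z]{2})([a-zA-Z])([0-9]+)catalog\.jpg$")


def rewrite_image(path):
    """Return the rewritten path, or the path unchanged if it does not match."""
    return pattern.sub(r"/images/\1-\2\3.jpg", path)


print(rewrite_image("/images/asd12catalog.jpg"))
print(rewrite_image("images/asd12catalog.jpg"))
print(rewrite_image("/images/logo.jpg"))
```

Because the leading slash is optional in the pattern, both the first and second paths map to /images/as-d12.jpg, while a filename without catalog is left alone.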

htaccess regex matching path containing 'foo' without 'bar' in it

I am trying to block some URL via a .htaccess file with RedirectMatch 403.
Anything that contains YES should be matched only if NO is not in the URL too. Some examples of matching URLs or not:
/dir/YES/ -> yes
/YES/file.ext -> yes
/dir/NO/dir/YES/file.ext -> no
/NO/dir/dir/YES/file.ext -> no
Also there is an unknown number of dir between /YES and /NO.
I've tried various lookbehind and patterns like:
(?<!themes/)(vendor)
(?<=[^themes])(/vendor)
(?<![themes/])(/|/[^themes]+/)vendor(/|$)
But am struggling to get anything working and wondering if this is actually a good idea.
Use mod_rewrite with a negated RewriteCond like this:
RewriteCond %{REQUEST_URI} !/NO/ [NC]
RewriteRule (^|/)YES(/.*|)$ - [F,L,NC]
Why your patterns aren't working:
(?<!themes/)(vendor)
Will only discard the match if themes/ is immediately to the left of vendor. Lookbehinds do not traverse the entire string automatically.
(?<=[^themes])(/vendor)
Character classes don't work like this. [^themes] matches a single character that is not one of e, h, m, s or t.
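Both points are easy to confirm with a few test strings. The sketch below uses Python's re module, whose lookbehind and character-class behavior matches what is described here; the sample paths are invented for the illustration.

```python
import re

# A negative lookbehind only inspects the text immediately before the match.
lookbehind = re.compile(r"(?<!themes/)(vendor)")

direct = lookbehind.search("themes/vendor")        # themes/ directly before vendor
indirect = lookbehind.search("themes/foo/vendor")  # themes/ further to the left

# [^themes] is ONE character that is not e, h, m, s or t.
char_class = re.compile(r"(?<=[^themes])(/vendor)")

ends_in_s = char_class.search("plugins/vendor")  # preceding char is "s"
ends_in_b = char_class.search("lib/vendor")      # preceding char is "b"

print(direct, indirect)
print(ends_in_s, ends_in_b)
```

Note that plugins/vendor is rejected by the second pattern simply because plugins ends in s, which has nothing to do with the word themes.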
What you would want is (?<!themes.*/)(vendor), but few regex engines (.NET is one) allow lookbehinds of arbitrary length, and PCRE, which Apache uses, is not among them.
The trick is to start at the beginning and make sure that there is no NO on the way to YES using lookaheads:
^(?!.*NO).*(YES)
or
^(?!.*themes).*(vendor)
as lookaheads can be of variable length. If NO is allowed to appear after YES, you have to check at every single character on the way to YES:
^((?!NO).)*(YES)
^((?!themes).)*(vendor)
EDIT: anubhava's solution is actually much neater. If you're ever in need of a single-regex solution you can use mine as a reference.
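For reference, the two lookahead variants can be checked against the example URLs from the question. This is a small Python sketch; the variable names and the extra URL with NO after YES are assumptions added for the illustration.

```python
import re

# Variant 1: NO must not appear anywhere in the URL.
no_anywhere = re.compile(r"^(?!.*NO).*(YES)")
# Variant 2: NO must not appear before YES, but may appear after it.
no_before = re.compile(r"^((?!NO).)*(YES)")

urls = [
    "/dir/YES/",
    "/YES/file.ext",
    "/dir/NO/dir/YES/file.ext",
    "/NO/dir/dir/YES/file.ext",
    "/YES/dir/NO/",  # extra case: NO after YES
]

for url in urls:
    print(url, bool(no_anywhere.search(url)), bool(no_before.search(url)))
```

The two patterns agree on the four URLs from the question and differ only on the last one, where NO comes after YES.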

How do I create a regular expression to negate a string but include anything else

I'm trying to create a regex for Apache that will ignore certain strings but use anything else. I've tried many different methods, but I just can't seem to get it right.
For example, I want it to ignore
ignore.mysite.com
but I want it to use anything else:
*.mysite.com
If you want a regex that matches whatever.mysite.com where whatever is any possible hostname, but you want the regex not to match ignore.mysite.com, then try this:
^(?!ignore\.)[a-z0-9-]+\.mysite\.com
The trick is to use negative lookahead.
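A short check of the idea in Python follows. The pattern used here puts \. after ignore and anchors the end with $, so that a host such as ignoreme.mysite.com is still accepted and trailing text is rejected; the host names other than ignore.mysite.com are made up for the example.

```python
import re

# (?!ignore\.) rejects only the exact label "ignore", not labels that
# merely start with it.
host_re = re.compile(r"^(?!ignore\.)[a-z0-9-]+\.mysite\.com$")

hosts = ["www.mysite.com", "ignore.mysite.com", "ignoreme.mysite.com"]
results = {host: bool(host_re.match(host)) for host in hosts}

for host, ok in results.items():
    print(host, ok)
```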
If you want this regular expression for mod_rewrite, you can use RewriteCond with negation, like this:
RewriteCond %{HTTP_HOST} !^ignore\.mysite\.com$
RewriteRule ....
You can find more in the mod_rewrite documentation.

Regex to match a URL pattern for a htaccess file

I'm really struggling with this regex.
We're launching a new version of our "groups" feature, but need to support all the old groups at the same URL. Luckily, these two features have different url patterns, so this should be easy? Right?
Old: domain.com/group/$groupname
New: domain.com/group/$groupid/$groupname
The regex I was using to redirect old groups to the old script was: RewriteRule ^(group/([^0-9/].*))$ /old_group.php/$1 [L,QSA]
It was working great until a group popped up called something like: domain.com/group/2009-neato-group
At which point the regex matched and forced the new group to the old code because it contained numbers. Ooops.
So, any thoughts on a regex that would match the old pattern so I can redirect it?
Patterns it must match:
group/my-old-group
group/another-old-group/
group/123-another-old-group-456/a-module-inside-the-group/
Patterns it cannot match:
group/12/a-new-group
group/345/another-new-group/
group/6789/a-3rd-group/module-abc/
The best way I've thought about it is that it cannot start with "group-slash-numbers-slash", but haven't been able to turn that into a regex. So, something to negate that pattern would probably work.
Thanks
Think the other way around: if the URL matches ^group/[0-9]+/.+, then it is a new group, so redirect these first (with an [L] rule). Everything else must be an old group, so you can keep your old rules for those.
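The pattern can be tested against the must-match and cannot-match lists from the question. This is a Python sketch of the classification only; is_new_group is a hypothetical helper name.

```python
import re

# New-style URLs start with group/<digits>/<something>.
NEW_GROUP = re.compile(r"^group/[0-9]+/.+")


def is_new_group(path):
    """True for new-style group URLs, False for everything else (old style)."""
    return NEW_GROUP.match(path) is not None


old_urls = [
    "group/my-old-group",
    "group/another-old-group/",
    "group/123-another-old-group-456/a-module-inside-the-group/",
]
new_urls = [
    "group/12/a-new-group",
    "group/345/another-new-group/",
    "group/6789/a-3rd-group/module-abc/",
]

for path in old_urls + new_urls:
    print(path, "new" if is_new_group(path) else "old")
```

The third old-style URL is classified correctly because the digits 123 are followed by a hyphen rather than a slash.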
Your design choices make a pure-htaccess solution difficult, especially since group/123-another-old-group-456/a-module-inside-the-group/ is a valid old-style URL.
I'd suggest a redirector script that looks at its arguments and determines whether the first part is a valid groupid. If so, it's a new-style URL, and the script takes the user to that group. Otherwise, it is an old-style URL, so hand it off to old_group.php. Something like this:
RewriteRule ^group/(.*)$ /redirector.php/$1 [L,QSA]

RegEx match replace help

I am trying to do a regex match and replace for an .htaccess file but I can't quite figure out the replace bit. I think I have the match part right, but maybe someone can help.
I have this url-
http://www.foo.com/MountainCommunities/Lifestyles/5/VacationHomeRentals.aspx
And I'm trying to turn it into this-
http://www.foo.com/mountain-lifestyle/Vacation-Home-Rentals.aspx
(MountainCommunities/Lifestyles)/\d/(.*)(.aspx)
and then I figured I would have a rewrite rule starting like this-
mountain-lifestyle/$2$3
but I need to take what is in $2 in this instance and rewrite it to place dashes between the words with capital letters. Now I'm stumped.
I think you'll have to do it in two bits... Take out $2, precede every capital (apart from the first) with a -, then just append the result to http://www.foo.com/mountain-lifestyle/ with a .aspx on the end.
Try this:
RewriteRule ^(([A-Z][a-z]+-)*)([A-Z][a-z]+)(([A-Z][a-z]+)+)(\.aspx)?$ /$1$3-$4 [N]
RewriteRule ^([A-Z][a-z]+-)+[A-Z][a-z]+$ /$0.aspx [R=301]
Note that mod_rewrite uses an internal counter to detect and avoid infinite loops. So your URL must not contain too many words that need converting (see the MaxRedirects option for Apache < 2.1 and the LimitInternalRecursion directive for Apache ≥ 2.1).
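To see how the repeated rule inserts one dash per pass, here is a rough Python simulation of the two rules applied to a bare filename. It only imitates the regex substitutions, not Apache's actual restart mechanics; the function name dash_words and the max_rounds limit (standing in for the internal counter) are assumptions for this sketch.

```python
import re

# First rule: split one more word off the front of the undashed remainder.
SPLIT_ONE = re.compile(
    r"^(([A-Z][a-z]+-)*)([A-Z][a-z]+)(([A-Z][a-z]+)+)(\.aspx)?$"
)
# Second rule: every word is now separated by a dash.
ALL_DASHED = re.compile(r"^([A-Z][a-z]+-)+[A-Z][a-z]+$")


def dash_words(name, max_rounds=10):
    """Apply the first rule repeatedly, then the second; None if it never settles."""
    for _ in range(max_rounds):
        m = SPLIT_ONE.match(name)
        if m is None:
            break
        # Equivalent of the substitution $1$3-$4 (without the leading slash).
        name = "{}{}-{}".format(m.group(1), m.group(3), m.group(4))
    if ALL_DASHED.match(name):
        return name + ".aspx"
    return None


print(dash_words("VacationHomeRentals.aspx"))
```

Each pass adds a single dash (VacationHomeRentals.aspx, then Vacation-HomeRentals, then Vacation-Home-Rentals), which is why long names need many rounds and can run into the loop limit.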
I don't think what you're doing with the capital letters is possible with regex...
You would be better keeping the dashes in the URL and removing the .aspx
eg: http://www.foo.com/MountainCommunities/Lifestyles/5/Vacation-Home-Rentals
This would require the following rule:
^/MountainCommunities/Lifestyles/5/([^/]+)/\?([^/]+) /mountain-lifestyle/$1.aspx?$2 [I]
This also takes into account any querystrings that are sent to the page.
BTW: How are you using .htaccess with IIS?
You can use the regular expression "([A-Z])" on the middle bit "VacationHomeRentals", replacing with "-$1". This will give you "-Vacation-Home-Rentals". Then you can just chop off the first character, stick the first part of the URL on the front, and put .aspx on the end.
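In a scripting language this takes only a couple of lines. The following is a minimal Python sketch of that approach, assuming the middle bit has already been extracted; the variable names are made up, and Python writes the backreference as \1 rather than $1.

```python
import re

middle = "VacationHomeRentals"

# Put a dash in front of every capital letter ...
dashed = re.sub(r"([A-Z])", r"-\1", middle)
# ... then drop the unwanted leading dash.
trimmed = dashed[1:]

url = "http://www.foo.com/mountain-lifestyle/" + trimmed + ".aspx"
print(dashed)
print(url)
```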
I think the main regex has been written by others, but here is a rule that matches the request name and places the dashes (assuming all the file names have a three-word camel-cased form like 'VacationHomeRentals.aspx'):
RewriteRule ^/MountainCommunities/Lifestyles/\d+/([A-Z][a-z]+)([A-Z][a-z]+)([A-Z][a-z]+)\.aspx$ /mountain-lifestyle/$1-$2-$3.aspx
This is a restricted version of @Gumbo's response, as I have not had a chance to test his recursion. The recursion technique is definitely the best and most usable for any scenario.
I don't think I quite understand what you are trying to do. Why can't you simply search for:
http://www.foo.com/MountainCommunities/Lifestyles/5/VacationHomeRentals.aspx
and replace it with:
http://www.foo.com/mountain-lifestyle/Vacation-Home-Rentals.aspx ?
Or is this a specific example of a pattern you are trying to transform?