htaccess - rewrite url with two rules - regex

Okay, I am having a little trouble with .htaccess as I am a beginner with it. This is what I am trying to accomplish.
URL DESIRED: example.com/package/forest/
What I have now:
RewriteRule ^package/forest/$ /deep/file/do_stuff.php?action=package&location=forest [R=301,QSA,L]
This works as intended: when you go to example.com/package/forest you get the page that is produced by do_stuff.php. The part I can't figure out is how to send extra GET variables to do_stuff.php while keeping the same URL.
STARTING URL: example.com/package/forest/list/5/2015-10-03
STILL DESIRED URL: example.com/package/forest
The difference being, I want the rewrite or redirect, not sure the term, to be
/deep/file/do_stuff.php?action=package&location=forest&pictures=5&date=2015-10-03
this time.
Not sure how to stack the rules so that I end up with the same URL whether they pass extra variables or not.

RewriteRule ^package/forest/list/(\d)/(\d{4}-\d{2}-\d{2})$ package/forest/deep/file/do_stuff.php?action=package&location=forest&pictures=$1&date=$2
The regex captures two groups, one with a single digit and one with the date. These are used later as $1 and $2.
Unfortunately I cannot test it as I am using nginx, so I am not 100% sure if I'm right.
I wondered whether the URI would get rewritten again after the first rewrite, because "package/forest/" would still match, but the dollar sign at the end of the pattern should prevent any URI longer than that from matching.
For the same reason you don't have to worry about stacking: the first rule only applies to the short URI ending with "forest/", the second only to those with a date at the end.
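On the other hand, I am wondering why you are using the QSA flag. RewriteRule matches the pattern against the URL path only, not the query string, so QSA only matters if requests to these URLs carry a query string that you want passed on to do_stuff.php.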
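Since I can't run Apache here, a quick way to sanity-check the claim that the two patterns never overlap is to try them in Python. This is only a sketch: Python's re module is not mod_rewrite, and the helper name which_rule is made up for the illustration, but the anchors and \d behave the same way for these patterns.

```python
import re

# The two patterns from the rules above, written the way RewriteRule sees
# the path in .htaccess (no leading slash, no query string).
SHORT = re.compile(r"^package/forest/$")
LONG = re.compile(r"^package/forest/list/(\d)/(\d{4}-\d{2}-\d{2})$")


def which_rule(path):
    """Return the name of the first pattern that matches, plus its groups."""
    for name, pattern in (("short", SHORT), ("long", LONG)):
        match = pattern.match(path)
        if match:
            return name, match.groups()
    return None, ()


print(which_rule("package/forest/"))
print(which_rule("package/forest/list/5/2015-10-03"))
# (\d) is a single digit, so a two-digit picture count matches neither rule.
print(which_rule("package/forest/list/12/2015-10-03"))
```

If more than one digit is possible for the picture count, (\d+) would be the obvious change.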

Related

How to only show id value on url path with htaccess?

What I have right now is
https://www.example.com/link.php?link=48k4E8jrdh
What I want to accomplish is to get this URL instead =
https://www.example.com/48k4E8jrdh
I looked on the internet but with no success :(
Could someone help me and explain how this works?
This is what I have right now (Am I in the right direction?)
RewriteEngine On
RewriteRule ^([^/]*)$ /link.php?link=$1
RewriteRule ^([^/]*)$ /link.php?link=$1
This is close, except that it will also match /link.php (the URL being rewritten to) so will result in an endless rewrite-loop (500 Internal Server Error response back to the browser).
You could avoid this loop by simply making the regex more restrictive. Instead of matching anything except a slash (ie. [^/]), you could match anything except a slash and a dot, so it won't match the dot in link.php, and any other static resources for that matter.
For example:
RewriteRule ^([^/.]*)$ link.php?link=$1 [L]
You should include the L flag if this is intended to be the last rule. Strictly speaking you don't need it if it is already the last rule, but otherwise if you add more directives you'll need to remember to add it!
If the id in the URL should only consist of lowercase letters and digits, as in your example, then consider just matching what is needed (eg. [a-z0-9]). Generally, the regex should be as restrictive as required. Also, how many characters are you expecting? Currently you allow from nothing to "unlimited" length.
Just in case it's not clear, you do still need to change the actual URLs you are linking to in your application to be of the canonical form. ie. https://www.example.com/48k4E8jrdh.
UPDATE:
It works but now the site always sees that page regardless if it is link.php or not? So what happens now is this: example.com/idu33d3dh#dj3d3j And if I just do this: example.com then it keeps coming back to link.php
This is because the regex ^([^/.]*)$ matches 0 or more characters (denoted by the * quantifier), so it also matches the empty path of a request for example.com/. You probably want to require at least one character (or some minimum number). For example, to match between 1 and 32 characters, change the quantifier from * to {1,32}, ie. ^([^/.]{1,32})$.
Incidentally, the fragment identifier (fragid) (ie. everything after the #) is not passed to the server so this does not affect the regex used (server-side). The fragid is only used by client-side code (JavaScript and HTML) so is not strictly part of the link value.
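To see the effect of the {1,32} quantifier and of excluding the dot, here is a small check in Python. It is a sketch only: Python's re stands in for Apache's regex engine, and extract_link_id is a hypothetical helper name.

```python
import re

# Same pattern as in the RewriteRule: 1 to 32 characters, no slash, no dot.
ID_PATTERN = re.compile(r"^([^/.]{1,32})$")


def extract_link_id(path):
    """Return the captured id, or None when the rule would not apply."""
    match = ID_PATTERN.match(path)
    return match.group(1) if match else None


print(extract_link_id("48k4E8jrdh"))  # the id is captured
print(extract_link_id("link.php"))    # contains a dot, so no match (no loop)
print(extract_link_id(""))            # empty path (the home page), no match
```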

htaccess - redirect string containing part of specific string

UPDATED - my initial question wasn't quite correct. (apologies to all concerned)
UPDATED again - (this is not my day today..)
I need to redirect all incoming image requests for:
http://www.example.com/images/asd12catalog.jpg (there is an additional alpha character)
To:
http://www.example.com/images/as-d12.jpg (I have added the "-")
So I need to strip out the word catalog and change the first portion of the filename to add a "-" making as-d12.jpg.
I have tried variations on:
RewriteRule ^/images/[a-z0-9]catalog.jpg$ /images/$1.jpg
But I just can't seem to get a match.
Can anyone help please?
Thanks.
Your attempt was very close; the only major problem was that you did not wrap anything in your regex in a capture group. With parentheses around each part of the filename below, the two leading letters, the third letter, and the digits become available as $1, $2 and $3 after the match, and you can use them in the redirect URL as you expected. The dot before jpg is also escaped so that it only matches a literal dot.
RewriteRule ^/images/([a-zA-Z]{2})([a-zA-Z])([0-9]*)catalog\.jpg$ /images/$1-$2$3.jpg
RewriteRule ^/?images/([a-zA-Z]{2})([a-zA-Z])([0-9]+)catalog\.jpg$ /images/$1-$2$3.jpg
You're not specific about the exact format of your filenames, but this will match three letters followed by one or more digits and then catalog.jpg, which will hopefully cover your requirements.
Also note that the leading / should at most be optional when matching in rewrite rules - they haven't been part of the path parsed by RewriteRule since version 1. See https://webmasters.stackexchange.com/questions/27118/when-is-the-leading-slash-needed-in-mod-rewrite-patterns
Edit: updated again for new requirement
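For anyone who wants to check the three capture groups without touching the server, the same pattern can be tried in Python. This is a sketch under a couple of assumptions: the dot in catalog.jpg is escaped here, the leading slash is optional as discussed above, and new_image_path is a hypothetical helper, not part of Apache.

```python
import re

# Two letters, one letter, one or more digits, then the literal "catalog.jpg".
CATALOG = re.compile(r"^/?images/([a-zA-Z]{2})([a-zA-Z])([0-9]+)catalog\.jpg$")


def new_image_path(path):
    """Build the redirect target from the groups, or None if no match."""
    match = CATALOG.match(path)
    if match is None:
        return None
    # Equivalent of /images/$1-$2$3.jpg in the RewriteRule substitution.
    return "/images/{}-{}{}.jpg".format(*match.groups())


print(new_image_path("images/asd12catalog.jpg"))
print(new_image_path("/images/asd12catalog.jpg"))
print(new_image_path("images/as-d12.jpg"))  # already in the new form
```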

Regex to match a URL with parameters but reject URL with subfolder

Short Question: What regex statement will match the parameters on a URL but not match a subfolder? For example match google.com?parameter but not match google.com/subfolder
Long Question: I am re-directing a few URLs on a site.
I want a request to ilovestarwars.com/page2 to re-direct to ilovestarwars.com/forceawakens
I setup this re-direct and it works great most of the time. The problem is when there are URL parameters. For example if someone sends the URL using an email program that tracks links. Then ilovestarwars.com/page2 becomes ilovestarwars.com/page2?parameter=trackingcode123 after they send it which results in a 404 on my site because it is looking for the exact URL.
No problem, I will just use regex. So I now re-direct using ilovestarwars.com/page2(.*) and it works great: it accepts all the parameters, no more 404s.
However, trying to future proof my work, I am worried, what happens if someone adds content inside the page2 folder? For example ilovestarwars.com/page2/mistake
They shouldn't, but if they do, it will take them forever to figure out why it is redirecting.
So my question is, how can I create a regex statement that will match the parameters but reject a subfolder?
I tried page2(.*?)/ as is suggested in this answer, but https://www.regex101.com/ says the slash is an unescaped delimiter.
Background info as suggested here, I am using Wordpress and the Redirection plugin. This is the article that goes over the initial redirect I setup.
A direct answer to your question would be something like this: ^/([^?&/\\]*)(.*)$
This assumes the string starts at the first / (if it doesn't, remove the / that follows the ^). In the first capture group you will get the page name (page2, in the case of your example URL) and in the second capture group, you will get the remaining part of the url (starting at the first of these chars: ?, &, /, \). The backslash is doubled so that it is a literal character inside the class rather than an escape for the closing bracket. If you don't care about the second capture group, use ^/([^?&/\\]*).*$
An indirect answer would be that you don't do it this way. Instead, there should be an index page in folder page2 that uses a 301 redirect to redirect to the proper page. It would make much more sense to do it statically. I understand that you may not have that much control over your webpage, though, since it is Wordpress, in which case the former answer should work with the given plugin.
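Here is a rough Python check of how that split behaves, assuming the character class is meant to contain a literal backslash (written \\ inside the class). Note that the pattern on its own still matches a subfolder URL; it just puts the subfolder into the second group. The helper is_page_without_subfolder is a hypothetical extra step I added to show one way to reject those, not something the Redirection plugin provides.

```python
import re

# Page name in group 1, everything from the first ?, &, / or \ in group 2.
PAGE_SPLIT = re.compile(r"^/([^?&/\\]*)(.*)$")


def split_url(url):
    """Return (page_name, rest) or None when the URL has no leading slash."""
    match = PAGE_SPLIT.match(url)
    return match.groups() if match else None


def is_page_without_subfolder(url, page):
    """True for /page and /page?query, False for /page/anything."""
    parts = split_url(url)
    if parts is None:
        return False
    name, rest = parts
    return name == page and (rest == "" or rest.startswith("?"))


print(split_url("/page2?parameter=trackingcode123"))
print(split_url("/page2/mistake"))
print(is_page_without_subfolder("/page2?parameter=trackingcode123", "page2"))
print(is_page_without_subfolder("/page2/mistake", "page2"))
```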

.htaccess dash separator 2 params or more

I'm working with .htaccess, and I just want to make friendly URLs.
My current URL:
www.url.com/index.php?v=[SOMETHING]&i=[IDIOM]
But it can also be:
www.url.com/index.php?v=[SOMETHING]
What I want:
www.url.com/[SOMETHING]-[IDIOM]
Or in the second case:
www.url.com/[SOMETHING]
.htaccess config I made:
RewriteRule ^([^/]+)-([^/]+)?$ index.php?v=$1&i=$2
Writing www.url.com/[SOMETHING]-[IDIOM] goes okay, with the webpage running properly.
But in the second case I have to write www.url.com/[SOMETHING]- (with a trailing dash). If I write www.url.com/[SOMETHING] the page breaks.
So, I want to make the second param optional, and separated by a dash if it's possible.
Can anyone help me please?
It won't match www.url.com/[SOMETHING] because it expects a - at the end. Include the - inside the second capturing group, like so:
^([^/]+?)(-[^/]+)?$
Also changed the first greedy quantifier to a lazy one to avoid matching too much.
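To show why the lazy quantifier matters here, this Python sketch compares the lazy pattern with a greedy version of it. One thing to be aware of: the second group includes the dash, so it has to be stripped before use (or the dash moved outside the capture). The helper split_params is hypothetical and only illustrates that step.

```python
import re

LAZY = re.compile(r"^([^/]+?)(-[^/]+)?$")
GREEDY = re.compile(r"^([^/]+)(-[^/]+)?$")

# Lazy: the first group stops at the first dash that lets the rest match.
print(LAZY.match("something-en").groups())
# Greedy: [^/]+ also matches dashes, so it swallows the whole string.
print(GREEDY.match("something-en").groups())


def split_params(path):
    """Return (v, i) with the leading dash removed from the optional part."""
    match = LAZY.match(path)
    if match is None:
        return None
    v, dashed = match.groups()
    return v, (dashed[1:] if dashed else None)


print(split_params("something-en"))
print(split_params("something"))
```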
You can just add the second rule
RewriteRule ^([^/]+)-([^/]+)?$ index.php?v=$1&i=$2
RewriteRule ^([^/]+)$ index.php?v=$1
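A small Python model of the two rules, tried in order with the first match winning. This is only a sketch of which pattern applies to which URL: it does not model how mod_rewrite re-processes the rewritten URL, and the rewrite function is made up for the illustration. The last call shows that index.php itself matches the second pattern, so some guard against rewriting it again may be needed in practice.

```python
import re

# (pattern, substitution) pairs in the same order as the rules above.
RULES = [
    (re.compile(r"^([^/]+)-([^/]+)?$"), r"index.php?v=\1&i=\2"),
    (re.compile(r"^([^/]+)$"), r"index.php?v=\1"),
]


def rewrite(path):
    """Apply the first matching rule and return the rewritten target."""
    for pattern, template in RULES:
        match = pattern.match(path)
        if match:
            # An unmatched optional group expands to an empty string.
            return match.expand(template)
    return path


print(rewrite("something-en"))
print(rewrite("something"))
print(rewrite("something-"))
print(rewrite("index.php"))
```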
I would suggest using a project called Helium. The documentation is available here: https://github.com/iamyogik/Helium
It will do all this stuff for you and you do not have to worry about anything. You can pass variable parameters between /{var1}/{var2}/ and also use $_GET or $_POST requests at the same time.

More efficient RewriteRule for messy URL

I have developed a new web site to replace an existing one for a client. Their previous site had some pretty nasty looking URLs to their products. For example, an old URL:
http://mydomain.com/p/-3-0-Some-Ugly-Product-Info-With-1-3pt-/a-arbitrary-folder/-18pt/-1-8pt-/ABC1234
I want to catch all requests to the new site that use these old URLs. The information I need out of the old URL is the ABC1234 which is the product ID. To clarify, the old URL begins with /p/ followed by four levels of folders, then the product ID.
So for example, the above URL needs to get rewritten to this:
http://mydomain.com/shop/?sku=ABC1234
I'm using Apache 2.2 on Linux. Can anyone point me on the correct pattern to match? I know this is wrong, but here is where I am currently at:
RewriteRule ^p/([A-Za-z0-9-]+)/([A-Za-z0-9-]+)/([A-Za-z0-9-]+)/([A-Za-z0-9-]+)/([A-Za-z0-9-]+)?$ shop/?sku=$5 [R=301,NC,L]
I'm pretty sure that the pattern used to match each of the 4 folders is redundant, but I'm just not that sharp with regex. I've tried some online regex evaluators with no success.
Thank you.
--EDIT #1--
Actually, my RewriteRule above does work, but is there a way to shorten it up?
--EDIT #2--
Thanks to ddr, I've been able to get this expression down to this:
RewriteRule ^p/([\w-]+/){4}([\w-]+)$ shop/?_sku=$2 [R=301,NC,L]
--EDIT #3--
Mostly for the benefit of ddr, but I welcome anyone to assist who can. I'm dealing with over 10,000 URLs that need to be rewritten to work with a new site. The information I've provided so far still stands, but now that I am testing that all of the old URLs are being rewritten properly I am running into a few anomalies that don't work with the RewriteRule example provided by ddr.
The old URLs are consistent in that the product ID I need is at the very end of the URL as documented above. The first folder is always /p/. The problem I am running into at this point is that some of the URLs have a URL-encoded double quote (%22). Additionally, some of the URLs contain a /-/ as one of the four folders mentioned. So here are some examples of the variations in the old URLs:
/p/-letters-numbers-hyphens-88/another-folder/-and-another-/another-18/ABC1234
/p/-letters-numbers-hyphens-88/2%22/-/-/ABCD1234
/p/letters-numbers-hyphens-1234/34-88/-22/-/ABCD1234/
Though the old URLs are nasty, I think it is safe to say that the following are always true:
Each begins with /p/
Each ends with the product ID that I need to isolate.
There are always four levels of folders between /p/ and the product ID.
Some folders in between have hyphens, some don't.
Some folders in between are a hyphen ONLY.
Some folders in between contain a % character where they are URL encoded.
Some requests include a / at the very end and some do not.
The following rule was provided by ddr and worked great until I ran into the URLs that contain a % percent sign or a folder with only a hyphen:
RewriteRule ^p/(?:[\w-]+/){4}([\w-]+)$ shop/?_sku=$1 [R=301,NC,L]
Given the rule above, how do I edit it to allow for a folder that is hyphen only (/-/) or for a percent sign?
You can use character classes to reduce some of the length. The parentheses (capture groups) are also unnecessary, except the last one, as @jpmc26 says.
I'm not especially familiar with Apache rules, but try this:
RewriteRule ^p/(?:[\w-]+/){4}([\w-]+)$ shop/?sku=$1 [R=301,NC,L]
It should work if extended regular expressions are supported.
\w is equivalent to [A-Za-z0-9_], and it does no harm to also match underscores here, so that's one replacement.
The {4} matches exactly four repetitions of the previous group. This is not always supported so Apache may not like it.
The ?: is optional but indicates that these parens should not be treated as a capture. Makes it slightly more efficient.
I'm not sure what the part in [] at the end is for but I left it. I can't see why you'd need a ? before the $, so I took it out.
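If you want to check the pattern outside Apache first, the same regex can be run in Python. This is a sketch, not a mod_rewrite test: the path is the example URL from the question with the leading slash removed (as in .htaccess), and extract_sku is a hypothetical helper. It also shows what happens when the repeated group is left capturing: it keeps only its last repetition, which is why the product ID then lands in the second group.

```python
import re

NON_CAPTURING = re.compile(r"^p/(?:[\w-]+/){4}([\w-]+)$")
CAPTURING = re.compile(r"^p/([\w-]+/){4}([\w-]+)$")

OLD_PATH = (
    "p/-3-0-Some-Ugly-Product-Info-With-1-3pt-/a-arbitrary-folder"
    "/-18pt/-1-8pt-/ABC1234"
)


def extract_sku(path):
    """Return the product ID captured by the non-capturing-folder pattern."""
    match = NON_CAPTURING.match(path)
    return match.group(1) if match else None


print(extract_sku(OLD_PATH))
# With a capturing folder group, group 1 is just the fourth folder.
print(CAPTURING.match(OLD_PATH).groups())
```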
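Edit: it is tempting to compact this further to ^p(/[\w-]+){5}$ with $5, but that does not work: the pattern has only one capture group, and a repeated group keeps just its last repetition (here including the leading slash), so $5 is always empty. Keep the separate final group:
RewriteRule ^p(?:/[\w-]+){4}/([\w-]+)$ shop/?sku=$1 [R=301,NC,L]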
EDIT: response to edit 3 of the question.
I'm surprised it doesn't work with only -. The [\w-]+ should match even where there is just a single -. Are you sure there isn't something else going on in these URLs?
You might also try replacing - in the regex with \-.
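As for the %, just change [\w-] to [\w%-]. Make sure you leave the - at the end! Otherwise the regex engine will try to interpret it as part of a character range. (If Apache hands the rule the percent-decoded path, the %22 will arrive as a literal double quote, in which case that character needs to be in the class instead.)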
EDIT 2: Or try this:
RewriteRule ^p/(?:.*?/){4}(.*?)/?$ shop/?sku=$1 [R=301,NC,L]
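Here is a Python check of that last pattern against the three sample URLs from the question (leading slash removed, and left percent-encoded exactly as posted). It is a sketch with made-up helper names. One caveat it brings out: because . also matches a slash, the pattern does not strictly enforce four folders. The STRICT variant using [^/] is my own assumption about what a tighter version could look like, not something from the question.

```python
import re

LOOSE = re.compile(r"^p/(?:.*?/){4}(.*?)/?$")
# Hypothetical stricter variant: a folder is anything without a slash.
STRICT = re.compile(r"^p/(?:[^/]+/){4}([^/]+)/?$")

SAMPLES = [
    "p/-letters-numbers-hyphens-88/another-folder/-and-another-/another-18/ABC1234",
    "p/-letters-numbers-hyphens-88/2%22/-/-/ABCD1234",
    "p/letters-numbers-hyphens-1234/34-88/-22/-/ABCD1234/",
]


def sku(pattern, path):
    """Return the captured product ID, or None when the path does not match."""
    match = pattern.match(path)
    return match.group(1) if match else None


for sample in SAMPLES:
    print(sku(LOOSE, sample), sku(STRICT, sample))

# A fifth folder slips into the capture with the loose pattern.
print(sku(LOOSE, "p/a/b/c/d/e/SKU"), sku(STRICT, "p/a/b/c/d/e/SKU"))
```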