Htaccess RewriteRule for optional trailing slash - regex

I have a rewrite rule
RewriteRule ^(la|en|me)/?(.*)$ /go/$2 [R=302,L]
When a user visits /la/route-1, they are redirected to /go/route-1.
A visit to /en is redirected to /go/.
This works well, but there is one problem:
If someone visits /eng/route-1, they are redirected to /go/g/route-1, although that URL should not trigger a redirect at all.
Any idea how to fix that?

This will probably do:
RewriteRule ^(?:la|en|me)(?:/(.*))?$ /go/$1 [R=302,L]
The separating slash is now grouped together with the path component you want to capture, so the two are optional as a unit: after the language code there must be either the end of the path or a slash.
Note: I also suggest making the first group a "non-capturing group" ((?: ... )), since you are not interested in what it captures. Consequently, you have to use $1 instead of $2 in the target path.
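The difference between the two patterns can be checked outside Apache. The following is a minimal sketch that uses Python's re module as a stand-in for the PCRE matching done by mod_rewrite; the helper name redirect_target is hypothetical, and treating an unmatched optional group as an empty string is an assumption about how the substitution behaves.

```python
import re

# Pattern from the question and pattern from the answer.
OLD = re.compile(r"^(la|en|me)/?(.*)$")
NEW = re.compile(r"^(?:la|en|me)(?:/(.*))?$")

def redirect_target(pattern, path, group):
    """Return the /go/... target, or None when the rule would not fire."""
    m = pattern.match(path)
    if m is None:
        return None
    # Assumption: an unmatched optional group expands to an empty string.
    return "/go/" + (m.group(group) or "")

for path in ("la/route-1", "en", "en/", "eng/route-1"):
    print(path, redirect_target(OLD, path, 2), redirect_target(NEW, path, 1))
```

With the old pattern, eng/route-1 still matches because the slash is optional on its own; with the new pattern it does not match, so no redirect would be issued.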

Related

.htaccess redirect is sending whole string, instead of partial

I partially have my .htaccess rule working. What I have currently is:
#tag to search redirect
RewriteCond %{REQUEST_URI} ^/tag\/*
RewriteRule ^(.*) https://www.testurl.co.uk/search-results?hsf=$1&id=12 [R=301,L]
What currently happens is that the whole of tag/* ends up where the $1 is.
For example, for the request tag/test the generated URL is:
https://www.testurl.co.uk/search-results?hsf=tag/test&id=12
when it should ideally be:
https://www.testurl.co.uk/search-results?hsf=test&id=12
Any help would be greatly appreciated.
Many thanks
You can use this rule:
RewriteRule ^tag/(.+)$ https://www.testurl.co.uk/search-results?hsf=$1&id=12 [R=301,L,QSA]
Pattern ^tag/(.+)$ will capture any value after /tag/ into group #1 and that is being used in $1.
Make sure to clear your browser cache before testing this.
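To see why the original rule passed the whole path along, here is a small sketch comparing what each pattern puts into group 1. It uses Python's re module in place of Apache's regex engine, and the helper name hsf_value is hypothetical.

```python
import re

def hsf_value(pattern, path):
    """Return what $1 would hold for this path, or None if there is no match."""
    m = re.match(pattern, path)
    return m.group(1) if m else None

# Pattern from the question: the group spans the entire path.
print(hsf_value(r"^(.*)", "tag/test"))
# Pattern from the answer: the literal tag/ prefix sits outside the group.
print(hsf_value(r"^tag/(.+)$", "tag/test"))
```

Because tag/ is matched literally in the second pattern, the separate RewriteCond on REQUEST_URI is no longer needed to restrict the rule to /tag/ URLs.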

Correct way to redirect url to specific language with parameter at the end of the url

I need to redirect, with htaccess, a URL that already has some Google Analytics parameters in it, adding a language parameter at the end of the URL. I tried this with no luck so far, and I know it's wrong at some point:
RewriteCond %{HTTP:Accept-Language} ^fr [NC]
RewriteRule ^random?$1 http://www.domain.com/random?$1&language=french [R=301,QSA]
I'm trying with $1, but I don't know if this is correct. The intention is for it to include all the existing parameters, like utm_source=1&utm_medium=2, after which I need to add the language parameter, so the redirected URL should look like http://www.domain.com/random?utm_source=1&utm_medium=2&language=french.
What is the right way to achieve what I need?
Thank you in advance.
Back-References
In your rule, $1 is a back-reference to a capture group that doesn't yet exist.
Try this instead:
RewriteCond %{HTTP:Accept-Language} ^fr [NC]
RewriteCond %{QUERY_STRING} !^.*language=french
RewriteRule ^(random.*) $1?language=french [R=301,QSA,L]
At the beginning ^ of the url path, (random.*) matches random and the rest of the url. (So we are only rewriting urls that start with random... is that what you want?)
We rewrite with the back-reference $1, followed by the query string ?language=french
The QSA flag ensures that the existing query-string parameters are kept: they are appended after language=french instead of being discarded.
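The interplay of the two conditions and the QSA flag can be sketched as a small simulation. This is not how Apache is implemented; it is a hedged model in Python, the function name rewrite is hypothetical, and the ordering of the merged query string follows the documented QSA behavior of appending the original query after the new one.

```python
import re

def rewrite(path, query, accept_language):
    """Return (path, query) after the rule, or None if the rule does not apply."""
    # RewriteCond %{HTTP:Accept-Language} ^fr [NC]
    if not re.match(r"^fr", accept_language, re.IGNORECASE):
        return None
    # RewriteCond %{QUERY_STRING} !^.*language=french  (guards against a loop)
    if re.match(r"^.*language=french", query):
        return None
    # RewriteRule ^(random.*) $1?language=french [R=301,QSA,L]
    m = re.match(r"^(random.*)", path)
    if m is None:
        return None
    new_query = "language=french"
    if query:
        # QSA: the original query string is appended after the new one.
        new_query += "&" + query
    return m.group(1), new_query

first = rewrite("random", "utm_source=1&utm_medium=2", "fr-FR,fr;q=0.9")
print(first)
# The redirected request already carries language=french, so it is left alone.
print(rewrite(first[0], first[1], "fr-FR,fr;q=0.9"))
```

In this model the language parameter ends up at the front of the query string rather than at the end, which normally makes no difference to the application reading it.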

Regex help for addon domain

We recently moved to an addon domain, and I'm having problems with the redirect. I need to redirect in the following situations:
mydomain.com/mysubfolder/ to mysubfolder.com
mydomain.com/mysubfolder/wp-login.php? to mysubfolder.com/wp-login.php?
mydomain.com/mysubfolder/page/ to mysubfolder.com/page/ [there are many pages]
This is my current regex:
RewriteEngine on
RewriteCond %{REQUEST_URI} !^/wp-login.php
RewriteRule ^/mysubfolder/(.*)$ http://www.mysubfolder.com/$1 [L,R=301]
It takes care of points 1 and 2, but I can't figure out how to take care of point 3.
Thanks in advance.
Your regex redirects all pages except wp-login.php. It also does not account for the URL without the trailing slash. I added the ? to correct this.
RewriteEngine on
RewriteRule ^/mysubfolder/?(.*)$ http://www.mysubfolder.com/$1 [L,R=301]
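A quick way to check which paths the pattern accepts is to run it against a few sample URLs. The sketch below uses Python's re module as a stand-in for Apache's regex engine; the helper name target and the sample path /mysubfolder2/page are hypothetical.

```python
import re

RULE = re.compile(r"^/mysubfolder/?(.*)$")

def target(path):
    """Return the redirect target for a path, or None if the rule does not match."""
    m = RULE.match(path)
    if m is None:
        return None
    return "http://www.mysubfolder.com/" + m.group(1)

for path in ("/mysubfolder", "/mysubfolder/", "/mysubfolder/page/",
             "/mysubfolder/wp-login.php", "/mysubfolder2/page"):
    print(path, "->", target(path))
```

One side effect worth knowing: because the slash is optional on its own, a path such as /mysubfolder2/page also matches and would be sent to /2/page on the new domain. If that matters, grouping the slash with the rest of the path, as in ^/mysubfolder(?:/(.*))?$, avoids it.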

.htaccess rewrite rule for adding a trailing slash by unmatching a string in the URL

The following .htaccess rule excludes URLs containing the string admin and adds a trailing slash (/) to the URL if admin is not found in it:
RewriteRule ^((?!admin).)*((?!\/).)$ /$1/ [L,R]
But it has an error:
http://www.domain.com/index
should result in:
http://www.domain.com/index/
But currently it results in:
http://www.domain.com/inde/
Please help me find a way to correct it.
Thanks very much.
Your expression consumes the last character in a separate group, so that character is missing from $1.
This will solve the issue:
RewriteRule ^(?!.*admin)(.*?)\/?$ /$1/ [L,R]
Check out the explained demo here: http://regex101.com/r/kL6pV1
Note: this excludes any URL that contains admin anywhere, not only URLs that start with admin.
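The behavior of the corrected pattern can be checked with a short sketch. It uses Python's re module in place of Apache's regex engine, and the helper name with_slash is hypothetical.

```python
import re

# The lookahead rejects any path containing "admin"; the lazy group
# leaves an optional trailing slash outside the capture.
RULE = re.compile(r"^(?!.*admin)(.*?)/?$")

def with_slash(path):
    """Return the rewritten path /<capture>/, or None if the rule does not match."""
    m = RULE.match(path)
    if m is None:
        return None
    return "/" + m.group(1) + "/"

for path in ("index", "index/", "admin/users", "my/admin"):
    print(path, "->", with_slash(path))
```

Note that a path that already ends in a slash also matches, so in a real configuration it may be necessary to add a condition that skips such paths to avoid redirecting a URL to itself.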

Understanding RegEx - SEO Duplication on last term

I have a problem with duplicate pages for SEO on a website I'm trying to fix. www.example.com/category/c1234 loads just the same as www.example.com/category/c1234garbage
I've been reading online and testing the code, and so far I have narrowed it down to a possible regex problem. I have the following lines:
# url rewrites
RewriteCond %{REQUEST_URI} ^/index\.cfm/.+ [NC]
RewriteRule ^/index.cfm/(([^/]+)/?([^/]+)?)/?(.*)? /index.cfm/$4?$2=$3 [NS,NC,QSA,N,E=SESDONE:true]
I added an R flag to the rule to check whether requests pass through it. They do, and after that rule the garbage at the end disappears.
Can someone help me understand this and figure out a way to fix it, so that going to www.example.com/category/c1234garbage redirects to www.example.com/category/c1234?
I've been searching online for quite a while now and thought it might be time to post here, since I can't seem to find a solution. I'm reading "Mastering Regular Expressions", but it might take a while for me to find the answers I'm looking for.
I appreciate any help you can give me. Thank you.
EDIT: This is what I have before that:
RewriteEngine On
Rewritebase /
# remove trailing index.cfm
RewriteCond %{QUERY_STRING} ^$
RewriteRule ^index.cfm(\?)?$ / [R=301,L]
# remove trailing slash
RewriteCond %{QUERY_STRING} ^$
RewriteRule (.*)/$ /$1 [R=301,L]
# Remove trailing ?
RewriteCond %{THE_REQUEST} \?\ HTTP [NC]
RewriteRule ^/?(index\.cfm)? /? [R=301,L]
# SEF URLs
SetEnv SEF_REQUEST false
RewriteRule ^[a-z\d\-]+/[a-z]\d+/? /index.cfm/$0 [NC,PT,QSA,E=SEF_REQUEST:true]
RequestHeader add SEF-Request %{SEF_REQUEST}e
RewriteCond %{HTTP:SEF_REQUES} ^true$ [NC]
RewriteRule . - [L]
EDIT: I was reading the htaccess again and found this that I don't understand but it might have some connection. It's located at the bottom of the file.
# lowercase the hostname, and set the TLD name to an enviroment variable
RewriteCond ${lowercase:%{SERVER_NAME}|NONE} ^(.+)$
RewriteCond %1 ^[a-z0-9.-]*?[.]{0,1}([a-z0-9-]*?\.[a-z.]{2,6})$
RewriteRule .? - [E=TLDName:%1]
From your description and your code, it sounds like this is the transformation that's happening here:
www.example.com/category/c1234garbage
↓
www.example.com/index.cfm?category=c1234garbage
So the problem, I think, is not your rewriting rules. The problem is how you're handling querystring parameters on the server side. If you have an actual page called index.cfm that's interpreting those parameters, you should tweak the code behind that page to validate them and redirect to /category/c1234 where appropriate.
I think the code in index.cfm is looking at the parameter, checking to see if it starts with something recognizable, and going from there. You need to make it more strict.
Alternatively, you could add another .htaccess rule to parse the c1234garbage part and decide which part is valid, and which part (if any) is garbage. I can't give you a regex for that, though, since I don't know the rules for a valid input in your application.
Edit:
I think I found the problem. This part here:
RewriteRule ^[a-z\d\-]+/[a-z]\d+/? /index.cfm/$0 [NC,PT,QSA,E=SEF_REQUEST:true]
You specify the beginning of the relative URL with ^, but you don't specify that you want it to match all the way to the end. So I think what's happening is that it's taking the part of the string that matches, throwing out everything else, and appending it to /index.cfm/. So it takes only the /category/c1234 part from /category/c1234garbage, because that's the part that matches ^[a-z\d\-]+/[a-z]\d+/?.
You can probably fix this with just a word break:
RewriteRule ^[a-z\d\-]+/[a-z]\d+\b/? /index.cfm/$0 [NC,PT,QSA,E=SEF_REQUEST:true]
If that doesn't work, I'm afraid we've reached the end of my htaccess knowledge. I'm more of a regex guy.
Just BTW, this still seems a little awkward. If I understand this right, part of the URL will still get thrown out if it doesn't fit your exact pattern. E.g. /category/c1234?abc=123 will lose its querystring parameters. You might want to redesign how your rules are set up.
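The effect of the missing end anchor, and of the suggested word boundary, can be illustrated with a short sketch. It uses Python's re module as a stand-in for Apache's regex engine, group 0 plays the role of $0, and the names LOOSE, STRICT and matched_part are hypothetical.

```python
import re

# The SEF rule pattern as posted, and the variant with \b before the optional slash.
LOOSE = re.compile(r"^[a-z\d\-]+/[a-z]\d+/?", re.IGNORECASE)
STRICT = re.compile(r"^[a-z\d\-]+/[a-z]\d+\b/?", re.IGNORECASE)

def matched_part(pattern, path):
    """Return the text that $0 would hold, or None if the pattern does not match."""
    m = pattern.match(path)
    return m.group(0) if m else None

for path in ("category/c1234", "category/c1234/", "category/c1234garbage"):
    print(path, matched_part(LOOSE, path), matched_part(STRICT, path))
```

With the word boundary, category/c1234garbage does not match at all, because there is no boundary between the digit 4 and the letter g. The rule then simply does not apply to that URL, which prevents the silent truncation but does not by itself redirect to the clean URL.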
I partially solved the problem. I added
# Remove garbage from after category
RewriteCond %{REQUEST_URI} [a-z\d\-]+/[a-z]\d+(.+)
RewriteRule ^([a-z\d\-]+/[a-z]\d+)/? $1 [R=301]
on top of the SEF rules. It does what I want, which is to remove the garbage from the URL, but it causes an infinite loop because it redirects even when the URL is clean. Any hints?
EDIT: So I realized that the .+ at the end is matching the numbers as well. How do I change it to match anything other than numbers after the numbers? Basically, where I have the .+ I need "match any character except for numbers".
EDIT: I finally got it to work with the following code:
# Remove garbage from after category
RewriteCond %{REQUEST_URI} [a-z\d\-]+/[a-z]\d+[A-Za-z-.]+
RewriteRule ^([a-z\d\-]+/[a-z]\d+)/? $1 [R=301]
The (.+) I was using previously treated the last digits of the number (in c1234) as part of the .+, so the condition always passed unless the URL was something like c1
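The difference between the looping condition and the final one can be reproduced with a short sketch. It uses Python's re module in place of Apache's regex engine, search is used because the RewriteCond pattern is unanchored, and the names LOOPING, FIXED and condition_passes are hypothetical.

```python
import re

# Condition that caused the loop, and the final condition from the last edit.
LOOPING = re.compile(r"[a-z\d\-]+/[a-z]\d+(.+)")
FIXED = re.compile(r"[a-z\d\-]+/[a-z]\d+[A-Za-z-.]+")

def condition_passes(pattern, uri):
    """Return True if the RewriteCond pattern would match REQUEST_URI."""
    return pattern.search(uri) is not None

for uri in ("/category/c1234", "/category/c1", "/category/c1234garbage"):
    print(uri, condition_passes(LOOPING, uri), condition_passes(FIXED, uri))
```

For the clean URL /category/c1234, the first pattern still matches because \d+ can give back the final digit to (.+); the second pattern requires a letter, hyphen or dot after the digits, so the clean URL no longer triggers the redirect.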