Redirect everything, except one subdomain - regex

Everything has to redirect to www.domain.com, except for test.domain.com, which will host a new version of the site for testing.
Both of the domains need to look within their web directory.
I've searched Stack Overflow, but none of the similar questions seem to provide a working solution for me, probably because I don't understand .htaccess / regex that well yet.
This is the current content of my .htaccess file.
RewriteEngine on
RewriteCond %{HTTP_HOST} !^www\.
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]
RewriteCond %{REQUEST_URI} !web/
RewriteRule (.*) /web/$1 [L]

Try this for your second line (where test is your subdomain):
RewriteCond %{HTTP_HOST} !^(www|test)\.
The ! negates the match.
The parentheses are a regex group.
The pipe symbol is an or.
Read together, the condition says "match every host name that does not start with www. or test."
Regex101 link here which might help explain further and gives my test data:
https://regex101.com/r/5udsER/1/
Disclaimer:
I wrote this afk so this is lacking an end-to-end test but should work.
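The negated condition can be sanity-checked outside Apache. The sketch below runs the same pattern through Python's `re` module, which is only a stand-in for mod_rewrite's regex engine; the function name and the host names are made-up test data, not part of the original answer.

```python
import re

# Same pattern as in the RewriteCond, without the leading "!"
# (the "!" is Apache's negation marker, not part of the regex).
HOST_PATTERN = re.compile(r"^(www|test)\.")


def needs_www_redirect(host: str) -> bool:
    # Mimics the negated RewriteCond: true when the pattern does NOT match.
    return HOST_PATTERN.search(host) is None


for host in ["domain.com", "www.domain.com", "test.domain.com", "blog.domain.com"]:
    action = "redirect" if needs_www_redirect(host) else "leave alone"
    print(f"{host}: {action}")
```

Only host names that start with neither `www.` nor `test.` reach the redirect rule in this model.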

Related

Replacing whitespaces in querystring with .htaccess

Google has indexed hundreds, if not thousands, of my URLs that contain the product name. My new e-commerce platform now replaces whitespace with hyphens when it constructs the URL, so I need .htaccess rules that automatically redirect the old URLs to the new ones by replacing the whitespace with hyphens.
The example URL I'm using is
detalle.php?titulo=Zapatillas%20Salomon%20Xr%20Mission&codigo=040-8800-072
but the number of whitespaces to be replaced can vary widely.
The last iteration of rules that I have tried is:
RewriteCond %{QUERY_STRING} ^(.*)[\s|%20](.*)&codigo=(.*)
RewriteRule detalle.php detalle.php?%1-%2&codigo=%3 [N=20]
In a tester I found online this only replaces the last whitespace and leaves the others unchanged; on my development server it does not even do that.
I have spent almost a day on this and am getting nowhere, even though according to the Apache documentation this should work.
Thanks in advance.
Edit:
The solution given by @anubhava
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s/+(.*?)(?:\+|%20|\s)+(.+?)\sHTTP [NC]
RewriteRule ^ /%1-%2 [L,NE,R=302]
worked as requested, but somehow broke the lines in my .htaccess that previously had been working perfectly (minus the whitespaces)
RewriteCond %{QUERY_STRING} ^titulo=(.*)&codigo=(.*)$
RewriteRule detalle.php http://otherdomain/%1--det--%2? [R=301,L]
this is to transform the URLs with parameters into "friendly" URLs.
Edit2:
There was some kind of problem on the development server because the site was in a subdirectory. I tried it on the production server and everything worked fine, so I'm accepting the answer.
I am adding this edit in case someone else has a similar situation.
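You can use this rule as the first rule in your root .htaccess to replace all spaces with hyphens. Each pass replaces one run of spaces, so a URL with several spaces is cleaned up over successive redirects.
RewriteEngine On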
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s/+(.*?)(?:\+|%20|\s)+(.+?)\sHTTP [NC]
RewriteRule ^ /%1-%2 [L,NE,R=302]
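To see how this rule handles a URL with several spaces, the following sketch replays the redirect chain in Python. It assumes the request line has the usual `METHOD path HTTP/version` shape and uses Python's `re` as a stand-in for Apache's regex engine; `redirect_once` and `follow_redirects` are illustrative names.

```python
import re

# The RewriteCond pattern from the answer; [NC] becomes re.IGNORECASE.
REQUEST_PATTERN = re.compile(
    r"^[A-Z]{3,}\s/+(.*?)(?:\+|%20|\s)+(.+?)\sHTTP", re.IGNORECASE
)


def redirect_once(request_line):
    # Returns the redirect target "/%1-%2", or None when the condition fails.
    match = REQUEST_PATTERN.match(request_line)
    if match is None:
        return None
    return "/" + match.group(1) + "-" + match.group(2)


def follow_redirects(path, limit=20):
    # Re-request the redirect target until the rule stops matching.
    hops = 0
    while hops < limit:
        target = redirect_once(f"GET {path} HTTP/1.1")
        if target is None:
            break
        path = target
        hops += 1
    return path, hops


old = "/detalle.php?titulo=Zapatillas%20Salomon%20Xr%20Mission&codigo=040-8800-072"
final, hops = follow_redirects(old)
print(final, "after", hops, "redirects")
```

In this model each redirect replaces the first remaining run of `%20`, `+`, or whitespace, and the chain ends once none are left.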

Cannot understand the mixed outcomes of an .htaccess rewrite rule / regex

I have a simple website comprised of one page with a div that gets populated with ajax content based on the links the user selects. This site is running on an Apache server with an .htaccess file in the domain's root directory. Requests to www.mydomain.com are directed to scripts/index.php while requests for dynamic content (but not resource files) are directed to the same .php script with the requested content passed as a parameter (e.g., www.mydomain.com/myProject will be rewritten as scripts/index.php?dynContent=myProject).
My rewrite rules are below and for the most part they are performing those described tasks properly; however, I've encountered some URLs that do not match the second condition even though I would expect them to -- though this is the first time I've had to write rules for an .htaccess file so I don't really know what I'm talking about... A good example of a URL that fails the second condition is www.mydomain.com/about, but I've encountered many more just by testing random words/letters.
Can you tell me why www.mydomain.com/about fails the second condition? Also, if there is a more elegant way to achieve the objectives I described above, I would love to learn about it. Thank you!!
RewriteCond %{HTTP_HOST} ^(www.)?mydomain.com$ [NC]
RewriteRule ^(/)?$ scripts/index.php [L]
RewriteCond %{REQUEST_URI} .*[^index.php|.css|.js|.jpg|.html|.swf]$
RewriteRule .* scripts/index.php?dynContent=$1 [L]
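This happens because the regex in your second condition is incorrect. Square brackets define a character class, not a list of alternatives, so [^index.php|.css|...]$ only requires that the last character of the URL not be one of the listed characters. /about ends in t, which appears in the class (from html), so the condition fails.
Change your code to: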
RewriteCond %{HTTP_HOST} ^(www\.)?mydomain\.com$ [NC]
RewriteRule ^(/)?$ scripts/index.php [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_URI} !\.(php|css|js|jpe?g|html|swf)$
RewriteRule ^(.*)$ scripts/index.php?dynContent=$1 [L]
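The difference between the two conditions can be checked with a few sample URLs. This is a sketch using Python's `re` in place of Apache's engine; the sample paths and function names are assumptions, and the `!-f` file check from the corrected rules is not modelled.

```python
import re

# Original condition: everything inside [^...] is ONE negated character class,
# so "|" and the letters of the extensions are just individual characters.
ORIGINAL = re.compile(r".*[^index.php|.css|.js|.jpg|.html|.swf]$")

# Corrected condition: a real alternation of extensions. Apache negates it
# with "!", so the rewrite applies when this pattern does NOT match.
STATIC_EXTENSION = re.compile(r"\.(php|css|js|jpe?g|html|swf)$")


def original_rewrites(uri: str) -> bool:
    return ORIGINAL.search(uri) is not None


def corrected_rewrites(uri: str) -> bool:
    return STATIC_EXTENSION.search(uri) is None


for uri in ["/about", "/gallery", "/style.css", "/photo.jpeg"]:
    print(uri, "original:", original_rewrites(uri), "corrected:", corrected_rewrites(uri))
```

With the original pattern, whether a URL is rewritten depends only on its final character, which is why seemingly random words fail.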

RewriteRule: ^ vs ^(.*)$ vs ^.*$ Is there a Difference?

What is the difference in using ^ vs ^(.*)$ vs ^.*$ as wildcards in a RewriteRule?
My goal is to redirect http://carnarianism.com/ (anything) to the landing (default) page of http://carnarian.com/. I have found the following solutions, which all seem to work, so I wonder which is better for performance?
RewriteRule ^ http://carnarian.com/ [R=301,L]
RewriteRule ^.*$ http://carnarian.com/ [R=301,L]
RewriteRule ^(.*)$ http://carnarian.com/ [R=301,L]
All of these seem to work okay. This is my very first post on StackOverflow, most of the time I can find an answer just searching for it.
To be clear: ABOVE the questioned RewriteRule in my .htaccess is a RewriteCond and WWW Handler as follows:
RewriteEngine On
RewriteBase /
# FROM www. --TO-- NO www. See no-www.org
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^(.*)$ http://%1/$1 [R=301,L]
RewriteCond %{HTTP_HOST} carnarianism\.com$ [NC]
########## The Above Questioned RewriteRule ??? ##########
RewriteRule ^ http://carnarian.com/ [R=301,L]
Note: I started this search with the following, but I did not want the following because the path was also passed, and I want it to go to the landing page only. Therefore, I know you need the parentheses to be able to use the $1 variable. I do not want the $1 variable.
RewriteRule ^(.*)$ http://carnarian.com/$1 [R=301,L]
^ makes none of the original URL accessible as backreferences. $0 is an empty string.
^.*$ makes the entire original URL accessible as the $0 backreference (so you can do e.g. http://example.com/oldurl.php?url=$0)
^(.*) makes the entire original URL accessible as both the $0 and $1 backreferences; it's usually used when you want to actually use the old URL in the replacement since it's more explicit about the use.
All of them match the same thing, but produce different backreference groups.
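The three patterns can be compared directly. The sketch below uses Python's `re` as a stand-in, where `group(0)` plays the role of `$0` and the captured groups play the role of `$1` onward; the sample URL and the `backrefs` helper are made up for illustration.

```python
import re


def backrefs(pattern: str, url: str):
    # Returns ($0, tuple of captured groups) for a pattern that matches the URL.
    match = re.match(pattern, url)
    return match.group(0), match.groups()


url = "old/page.php"
for pattern in ("^", "^.*$", "^(.*)$"):
    whole, captures = backrefs(pattern, url)
    print(f"{pattern!r}: $0={whole!r} captures={captures!r}")
```

All three patterns match the sample URL; only the amount of text made available to the substitution differs.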
The one that is better performance wise is the one you have benchmarked yourself.
But since you are using a .htaccess file rather than having this configuration in the server directly (maybe via a VirtualHost?) which is parsed only once, it really doesn't matter. Parsing .htaccess files at every single request is much more time consuming than performing the regular expression by a factor of thousands.
If you care about performance you should never use .htaccess files, and you should even disable their parsing with AllowOverride None. If they are not disabled, then for a request like http://example.com/sites/css/theme/main.css Apache will still try to load all of the following files:
.htaccess
sites/.htaccess
sites/css/.htaccess
sites/css/theme/.htaccess
It will generate system calls even if those files do not exist.
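The lookup described above can be sketched as a small function that lists the per-directory files to check for a given request. It is a simplification that assumes the URL path maps directly onto directories below the document root; `htaccess_candidates` is an illustrative name, not an Apache API.

```python
import posixpath


def htaccess_candidates(url_path: str) -> list[str]:
    # One .htaccess per directory level, from the document root down to
    # the directory that contains the requested file.
    directory = posixpath.dirname(url_path.lstrip("/"))
    candidates = [".htaccess"]
    current = ""
    for part in directory.split("/"):
        if not part:
            continue
        current = posixpath.join(current, part)
        candidates.append(posixpath.join(current, ".htaccess"))
    return candidates


for path in htaccess_candidates("/sites/css/theme/main.css"):
    print(path)
```

The deeper the requested file sits, the more lookups each request costs, regardless of what the rules inside the files say.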
Trying therefore to improve your RewriteRule in an .htaccess file is like sneezing in the ocean in the hope of making it less salty. :)
Now, if you improved your setup to use server configuration, and to answer your original question: ^.*$ might be more efficient than ^(.*)$, as fewer references need to be created. Chances are high, however, that you can't measure it.

.htaccess help: RewriteRule entire website

I recently launched a website into its production environment.
This entire website will be in this folder structure:
/root/v1/website.com/index.php
The help I need is with .htaccess. When I'm upgrading an environment I require 0 downtime, so I want to make the next version of the website in a folder named:
/root/v2/website.com/index.php and available to switch over immediately.
basically "flip a switch", by sending all traffic to the corresponding folder in the current version.
So for example, right now, I would like all traffic that goes from http://www.website.com/cookies/aregreat.php to be opened at:
/root/v1/website.com/cookies/aregreat.php
This would apply to images, js and css files too
And then I can obviously change the version from inside the .htaccess and the rest will work itself out.
I'm not familiar with RewriteRule and I'm not too great with regex; the closest I've got to solving the problem is:
RewriteEngine on
RewriteRule (.*) /../v1.0.0/$1
Which is probably totally wrong. Is this even possible?
All help is welcome.
Many Thanks,
Dan
If you really want 'all traffic' to use the new version, I would not use an .htaccess file for this, but a symbolic link. In root you'd have one link 'released' that points to v1. Create it like this:
ln -s /root/v1 /root/released
Point your vhost at released. When you want to switch, do a one line command:
rm -rf /root/released; ln -s /root/v2 /root/released
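If the brief gap between removing the old link and creating the new one matters, a common variation is to create the new link under a temporary name and rename it over the old one, since a rename replaces its target in a single step on POSIX systems. The sketch below shows that idea in Python; `switch_release` and the `.tmp` suffix are illustrative choices, and the demo runs in a throwaway temporary directory rather than /root.

```python
import os
import tempfile


def switch_release(link_path: str, new_target: str) -> None:
    # Build the new symlink under a temporary name, then rename it over
    # the live link so there is no moment when the link is missing.
    temp_link = link_path + ".tmp"
    if os.path.lexists(temp_link):
        os.remove(temp_link)
    os.symlink(new_target, temp_link)
    os.replace(temp_link, link_path)


with tempfile.TemporaryDirectory() as root:
    v1 = os.path.join(root, "v1")
    v2 = os.path.join(root, "v2")
    os.mkdir(v1)
    os.mkdir(v2)
    released = os.path.join(root, "released")
    os.symlink(v1, released)       # initial state: released -> v1
    switch_release(released, v2)   # flip the switch: released -> v2
    print("released now points at v2:", os.readlink(released) == v2)
```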
How about configuring it in your apache virtual host configuration (DocumentRoot)?
If you need to do this in .htaccess, try this:
RewriteEngine on
RewriteCond %{HTTP_HOST} ^domainname.com$ [NC,OR]
RewriteCond %{HTTP_HOST} ^www.domainname.com$
RewriteCond %{REQUEST_URI} !folderV2/
RewriteRule (.*) /folderV2/$1 [L]
Note: I haven't tested this, I'm not 100% sure that it will work.
In an .htaccess file in your document root (that would be /root/.htaccess):
RewriteEngine On
RewriteCond %{REQUEST_URI} !^/v[0-9.]+/website.com
RewriteRule ^(.*)$ /v1/website.com/$1 [L]
You'd just need to change the v1 to whatever version you have and save the file, and you've flipped the switch.
Unfortunately this doesn't work, but this is the type of solution that I would like. Would it be possible to rewrite this to work on subdomains too?
Yeah, you can match against the %{HTTP_HOST} and backreference using the % symbol:
RewriteEngine On
RewriteCond %{HTTP_HOST} ^(www\.)?(.*)$ [NC]
RewriteCond %{REQUEST_URI} !^/v[0-9.]+/%2
RewriteRule ^(.*)$ /v1/%2/$1 [L]
Here, %2 is going to be whatever matches after an optional www.. So if the URL is:
http://www.domain.com/cookies/aregreat.php
Then you should get rewritten to /v1/domain.com/cookies/aregreat.php, and:
http://cakes.domain.com/chocolate/cake.php
should get rewritten to /v1/cakes.domain.com/chocolate/cake.php, etc.
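The intended mapping from host name to folder can be written out as a short sketch. It models only the outcome described above, not mod_rewrite's actual processing, and uses Python's `re` as a stand-in; `rewritten_path` and its `version` parameter are illustrative.

```python
import re

# Same pattern as the first RewriteCond: optional "www." then the rest of the host.
HOST_PATTERN = re.compile(r"^(www\.)?(.*)$", re.IGNORECASE)


def rewritten_path(host: str, path: str, version: str = "v1") -> str:
    # Group 2 corresponds to %2: the host name without an optional leading "www.".
    site = HOST_PATTERN.match(host).group(2)
    return "/" + version + "/" + site + "/" + path.lstrip("/")


print(rewritten_path("www.domain.com", "/cookies/aregreat.php"))
print(rewritten_path("cakes.domain.com", "/chocolate/cake.php"))
```

Switching versions then amounts to changing the single `version` value, which mirrors editing `v1` in the rule.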

RewriteCond in .htaccess with negated regex condition doesn't work?

I'm trying to prevent, in this case WordPress, from rewriting certain URLs. In this case I'm trying to prevent it from ever handling a request in the uploads directory, and instead leave those to the server's 404 page. So I'm assuming it's as simple as adding the rule:
RewriteCond %{REQUEST_URI} !^/wp-content/uploads/
This rule should evaluate to false and make the chain of rules fail for those requests, thus stopping the rewrite. But no... Perhaps I need to match the full string in my expression?
RewriteCond %{REQUEST_URI} !^/wp-content/uploads/.*$
Nope, that's not it either. So after scratching my head I do a check of sanity. Perhaps something is wrong with the actual pattern. So I make a simple test case.
RewriteCond %{REQUEST_URI} ^/xyz/$
In this case, the rewrite happens if and only if the requested URL is /xyz/ and shows the server's 404 page for any other page. This is exactly what I expected. So I'll just stick in a ! to negate that pattern.
RewriteCond %{REQUEST_URI} !^/xyz/$
Now I'm expecting to see the exact opposite of the above condition. The rewrite should not happen for /xyz/ but for every other possible URL. Instead, the rewrite happens for every URL, both /xyz/ and others.
So, either the use of negated regexes in RewriteConds is broken in Apache, or there's something fundamental I don't understand about it. Which one is it?
The server is Apache2.
The file in its entirety:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !^/wp-content/uploads/
RewriteRule . /index.php [L]
</IfModule>
WordPress's default file plus my rule.
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_URI} !^/wp-content/uploads/ [OR]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
So, after a lot of irritation, I figured out the problem, sort of. As it turned out, the rule in my original question actually did exactly what it was supposed to. So did a number of other ways of doing the same thing, such as
RewriteRule ^wp-content/uploads/.*$ - [L]
(Mark rule as last if pattern matches) or
RewriteRule ^wp-content/uploads/.*$ - [S=1]
(Skip the next rule if pattern matches) as well as the negated rule in the question, as mentioned. All of those rules worked just fine, and returned control to Apache without rewriting.
The problem happened after those rules were processed: I had deleted the default 404.shtml, 403.shtml etc. templates that my host provided. If you don't have any .htaccess rewrites, that works just fine; the server will dish up its own default 404 page and everything works. (At least that's what I thought, but in actual fact it was the double error "Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument to handle the request.")
When you do have a .htaccess, on the other hand, it is executed a second time for the 404 page. If the page is there, it will be used. In my case it was missing, so the request for 404.shtml was caught by the catch-all rule and rewritten to index.php. For this reason, all other suggestions I've gotten here, or elsewhere, have failed, because in the end the 404 page has been rewritten to index.php.
So, the solution was simply to restore the error templates. In retrospect it was pretty stupid to delete them, but I have this "start from scratch" mentality. Don't want anything seemingly unnecessary lying around. At least now I understand what was going on, which is what I wanted.
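The sequence described above can be modelled with a toy simulation. This is a sketch under simplifying assumptions: the `!-f` and `!-d` checks are reduced to membership in a set of existing paths, the error document is assumed to be /404.shtml as on the host in question, and `apply_rules` and `serve` are made-up names.

```python
UPLOADS_PREFIX = "/wp-content/uploads/"


def apply_rules(uri: str, existing_paths: set) -> str:
    # RewriteRule ^index\.php$ - [L]
    if uri == "/index.php":
        return uri
    # RewriteCond !-f / !-d (simplified: the path is known to exist)
    if uri in existing_paths:
        return uri
    # RewriteCond %{REQUEST_URI} !^/wp-content/uploads/
    if uri.startswith(UPLOADS_PREFIX):
        return uri
    # RewriteRule . /index.php [L] (needs at least one character after "/")
    return "/index.php" if len(uri) > 1 else uri


def serve(uri: str, existing_paths: set, error_document: str = "/404.shtml") -> str:
    target = apply_rules(uri, existing_paths)
    if target in existing_paths:
        return target
    # 404: the error document is requested internally, and the rules run again.
    return apply_rules(error_document, existing_paths)


missing_upload = "/wp-content/uploads/missing.jpg"
print(serve(missing_upload, {"/index.php"}))                # templates deleted
print(serve(missing_upload, {"/index.php", "/404.shtml"}))  # templates restored
```

In this model the uploads exclusion works both times; what changes the result is whether the error document itself exists when the rules run for it.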
Finally a comment to Cecil: I never wanted to forbid access to anything, just stop the rewrite from taking place. Not that it matters much now, but I just wanted to clarify this.
If /wp-content/uploads/ is really the prefix of the requested URI path, your rule was supposed to work as expected.
But as it obviously doesn’t work, try matching against the path that remains after the per-directory prefix is stripped, rather than against the full URI path. For an .htaccess file in the document root directory, that is the URI path without the leading /:
RewriteCond $0 !^wp-content/uploads/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .+ /index.php [L]
If that doesn’t work either, it would certainly help to get some insight into mod_rewrite’s rewriting process by using its logging feature. Set RewriteLogLevel to at least 4, make your request, and take a look at the entries in the log file specified with RewriteLog (on Apache 2.4 these two directives were replaced by the LogLevel directive with a rewrite:traceN level). There you can see how mod_rewrite handles your request, and at level 4 or higher you will also see the values of variables like %{REQUEST_URI}.
I have found many examples like this when taking a "WordPress First" approach. For example, adding:
ErrorDocument 404 /error-docs/404.html
to the .htaccess file takes care of the message ("Additionally, a 404 Not Found error...").
Came across this trying to do the same thing in a Drupal site, but might be the same for WP since it all goes through index.php. Negating index.php was the key. This sends everything to the new domain except old-domain.org/my_path_to_ignore:
RewriteCond %{REQUEST_URI} !^/my_path_to_ignore$
RewriteCond %{REQUEST_URI} !index.php
RewriteCond %{HTTP_HOST} ^old-domain\.org$ [NC]
RewriteRule ^(.*)$ http%{ENV:protossl}://new-domain.org/$1 [L,R=301]
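The three conditions above combine as a logical AND, which the following sketch spells out. It uses Python's `re` as a stand-in for Apache's engine; `should_redirect`, `redirect_target`, and the `protossl` parameter (empty or "s", mirroring the environment variable in the rule) are illustrative names.

```python
import re


def should_redirect(host: str, uri: str) -> bool:
    if re.search(r"^/my_path_to_ignore$", uri):
        return False  # the one path that stays on the old domain
    if re.search(r"index.php", uri):
        return False  # leave the front controller alone
    return re.search(r"^old-domain\.org$", host, re.IGNORECASE) is not None


def redirect_target(host: str, uri: str, protossl: str = ""):
    # Returns the 301 target, or None when the conditions do not all hold.
    if not should_redirect(host, uri):
        return None
    return "http" + protossl + "://new-domain.org/" + uri.lstrip("/")


for uri in ["/about", "/my_path_to_ignore", "/index.php"]:
    print(uri, "->", redirect_target("old-domain.org", uri))
```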