Enabling HTTPS in Drupal behind an AWS Elastic Load Balancer

I was having a really bad time trying to get our Drupal site to run fully over HTTPS behind an AWS load balancer using Apache and mod_rewrite. The ELB terminates SSL: all traffic to the ELB should be encrypted, and the traffic from the ELB to the EC2 instances is plain HTTP (pretty standard).
I attempted all sorts of mod_rewrite conditions and rules in .htaccess and Apache conf.d/*.conf files. When I was able to get it to redirect traffic to HTTPS, it would break the ELB's health checks, taking my "unhealthy" EC2 instance out of the pool. If I tried to fix it so the ELB health checks would pass, I'd get an infinite redirect loop.
After a week or so of working on this on and off, I finally found a solution. If you're having the same issue, please look here! It might not work 100% for you, but at least I may be able to shed some light on how to go about fixing it.

Well, here's my answer for a site where I want ALL traffic directed to https://example.com. (If you want https://www.example.com, you can make a few tweaks.)
First off, Drupal's settings.php file at /sites/default/settings.php. I have the following in this file:
$base_url = '//example.com';
$conf['reverse_proxy'] = TRUE;
$conf['reverse_proxy_addresses'] = array('name-of-my-loadbalancer.us-west-2.elb.amazonaws.com');
$conf['reverse_proxy_header'] = 'HTTP_X_CLUSTER_CLIENT_IP';
To be honest, I don't know if the above "reverse_proxy" settings are actually necessary. In fact, I have disabled them and it doesn't seem to affect anything, so they might not be. The important part is to make sure you have $base_url = '//example.com'; in your settings.php file.
The next part is configuring your .htaccess file. Here are the bits that are important:
RewriteCond %{HTTP:X-Forwarded-Proto} !https
RewriteCond %{HTTPS} off
RewriteCond %{REQUEST_URI} !=/healthy.html
RewriteRule ^ https://example\.com%{REQUEST_URI} [L,R=301]
For a noob like me, this was tough to figure out at first, but here's the breakdown. All three conditions must match for the RewriteRule to fire:
RewriteCond %{HTTP:X-Forwarded-Proto} !https This looks at the protocol reported by the load balancer in the X-Forwarded-Proto header. The condition matches if that protocol is NOT https.
RewriteCond %{HTTPS} off This matches if the request reaching Apache itself is not HTTPS, which is always the case here because the ELB talks plain HTTP to the instances.
RewriteCond %{REQUEST_URI} !=/healthy.html This is an important bit. I have a simple healthy.html file that contains the word "Success!" in my main Drupal webroot directory for Apache. When the ELB requests healthy.html, this condition fails, so the request bypasses the rewrite rule. If it didn't, the ELB health check would fail, taking our server(s) offline.
RewriteRule ^ https://example\.com%{REQUEST_URI} [L,R=301] Here is the actual rewrite rule. If all of the above conditions match, this redirects the incoming URL to https://example.com/whatever. By the way, the L stands for "Last," as in "this is the last rule of this set," and the "R=301" stands for "301 Redirect."
The only time this doesn't do a proper redirect is if I manually type in https://www.example.com (with the https at the beginning). I think I can fix that with another simple RewriteCond.
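A minimal sketch of what that extra rule might look like, assuming the ELB passes the browser's Host header through unchanged and that the rule is placed right after the block above (and before Drupal's own rewrite rules). A request for https://www.example.com arrives with X-Forwarded-Proto set to https, so the first rule never fires; this one matches on the host name instead. I have not tested it against this exact setup.

```apache
# Hypothetical addition: send www requests that are already HTTPS to the bare domain.
# ELB health checks normally do not use the www host name, so they should not match here.
RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
RewriteRule ^ https://example.com%{REQUEST_URI} [L,R=301]
```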

In case anyone like me lands here with Drupal 9 hosted in an AKS cluster: if you are using an ingress, add the following annotation to it.
appgw.ingress.kubernetes.io/backend-hostname: "example.com"
After adding this line to the ingress and applying it to AKS,
echo $_SERVER['HTTP_HOST'];
will print
example.com
as your new host, which should solve the Drupal base_url issue.

Related

Adding a second ec2 instance to target group causing "too many redirects" error (redirect loop)

I'm trying to set up a CloudFront -> Application Load Balancer -> Auto Scaling -> EC2 stack on AWS.
Everything works until it scales to more than one EC2 instance, which then causes a redirect loop with the error message "Too many redirects".
Here are the related settings:
I've enabled an ACM SSL certificate and attached it to the CloudFront distribution.
DNS points to the CloudFront domain name.
CloudFront 'Origin Protocol Policy' = HTTP Only
ELB Listener 1 = HTTP:80 redirects to HTTPS:443
ELB Listener 2 = HTTPS:443 forwards to the target group of 2 EC2 instances
.htaccess
RewriteEngine On
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^app\.php(?:/(.*)|$) %{ENV:BASE}/$1 [R=301,L]
RedirectMatch 302 ^/$ /app.php/
Please help me solve this redirect loop and explain why the current settings are not working.
Any time you spend on this is highly appreciated.
You appear to be using both mod_rewrite and RedirectMatch to perform two different redirects.
This appears to redirect any request starting with app.php to the base website URL:
RewriteRule ^app\.php(?:/(.*)|$) %{ENV:BASE}/$1 [R=301,L]
This appears to redirect any request for / to /app.php/:
RedirectMatch 302 ^/$ /app.php/
These rules seem to be in direct conflict with one another. If you request either the root website path / or /app.php, you are going to get into a redirect loop.
This condition tells Apache to track redirects internally in order to prevent a redirect loop:
RewriteCond %{ENV:REDIRECT_STATUS} ^$
However that only works as long as you have one server. When you have multiple load-balanced servers they can't track if a redirect has been issued by another server in the pool.
I suggest taking a look at these redirect rules and only using one of them depending on what your specific needs are.
I was able to resolve this, temporarily, with Mark B's answer.
"You didn't include any info about logging in and user sessions in your question. For the short term, I would enable sticky sessions on the load balancer. Long term I would look into a distributed session store." – Mark B

Redirect all urls to new domain but some specific urls

Good morning, all. I have a WordPress website and I want to redirect all URLs to a new domain, except:
http://domain.it/?page_id=3668
http://domain.it/?team={name}-{surname}
I wrote this code in the htaccess file
#RewriteCond %{QUERY_STRING} !^team=([a-z-]+)$
#RewriteCond %{QUERY_STRING} !^page_id=3668$
#RewriteRule ^(.*)$ https://newdomain.it/ [L,R=301]
but it does not work correctly. In the Network tab of the Firefox developer tools, I see that some resources are loaded from newdomain.it (for example CSS and images).
What am I doing wrong?
This probably is what you are looking for:
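RewriteEngine on
# Leave the two excepted URLs on the old domain untouched
RewriteCond %{HTTP_HOST} ^old\.example\.com$
RewriteCond %{QUERY_STRING} ^page_id=3668$ [OR]
RewriteCond %{QUERY_STRING} ^team=\w+-\w+$
RewriteRule ^ - [END]
# Redirect everything else on the old domain to the new one
RewriteCond %{HTTP_HOST} ^old\.example\.com$
RewriteRule ^/?(.*)$ https://new.example.com/$1 [R=301]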
It allows the two domains to be served by the same http server, but that is not a requirement. If you operate two separate http servers, then these rules belong in the one serving the old domain, obviously.
It is a good idea to start out with a 302 temporary redirection and only change that to a 301 permanent redirection later, once you are certain everything is correctly set up. That prevents caching issues while trying things out.
In case you receive an internal server error (http status 500) using the rules above, chances are that you operate a very old version of the apache http server. You will see a definite hint to an unsupported [END] flag in your http server's error log file in that case. You can either try to upgrade or use the older [L] flag; it will probably work the same in this situation, though that depends a bit on your setup.
This implementation will work likewise in the http server's host configuration or inside a dynamic configuration file (".htaccess" file). Obviously the rewriting module needs to be loaded inside the http server and enabled in the http host. In case you use a dynamic configuration file, you need to take care that its interpretation is enabled at all in the host configuration and that it is located in the host's DOCUMENT_ROOT folder.
And a general remark: you should always prefer to place such rules in the http server's host configuration instead of using dynamic configuration files (".htaccess"). Those dynamic configuration files add complexity, are often a cause of unexpected behavior, are hard to debug, and really slow down the http server. They are only provided as a last option for situations where you do not have access to the real http server's host configuration (read: really cheap service providers) or for applications insisting on writing their own rules (which is an obvious security nightmare).

Setting up AWS Beanstalk with Name.Com DNS - URL redirection doesn't work

I have a website www.example.com and it is hosted on elastic-beanstalk. I am using the name.com DNS servers. I have followed the steps in the following blogs to set up https and URL settings:
https://colintoh.com/blog/map-custom-domain-to-elastic-beanstalk-application
https://medium.com/@jbesw/tutorial-adding-https-to-a-custom-domain-on-elastic-beanstalk-29a5617b8842
i.e
Create a CNAME pointing www.example.com to the beanstalk
Add a URL redirect for #.example.com to https://www.example.com
After this, www.example.com works, and http://example.com gets redirected to www.example.com.
But for a page inside the site, like www.example.com/about, typing in http://example.com/about does not work: it does not get redirected to www.example.com/about.
Most blogs suggest moving to AWS Route 53. Is that the only option?
The issue, as you've found out, is that DNS-level redirects don't work on a page-specific level. At least, not without some extra magic happening in the background (which some registrars implement).
Even if that setup did work, you'd still have some SEO issues to deal with. For example, you want the example.com > www.example.com redirect (in every case I know of) to be a 301 redirect. This lets search engines like Google know "Use only the www version of this page, please." Otherwise, you effectively have two pages floating around out there, either of which (or both) could be indexed and considered duplicate content of one another.
Using the Route 53 servers is certainly an option, but not one you have to use. The issue is that you need to do this at the server level, not the DNS level.
At the server level, you can specify more complex and granular redirection rules such as "send any non-www, non-https traffic to the www, https version of the page and indicate this is a permanent preference (301)". That redirect (on an Apache server) would look like this:
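RewriteEngine On
# Redirect when the request is not HTTPS or the host lacks the www prefix
RewriteCond %{HTTPS} !=on [OR]
RewriteCond %{HTTP_HOST} !^www\. [NC]
# Capture the host without any leading "www." so it is never doubled
RewriteCond %{HTTP_HOST} ^(?:www\.)?(.+)$ [NC]
RewriteRule ^ https://www.%1%{REQUEST_URI} [R=301,L,NE]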
Quick reference: NC means case-insensitive matching, R specifies the type of redirect (301 here), L stops processing further rules, and NE tells Apache not to escape special characters such as # or ? in the redirect target. For a full list of RewriteRule flags, see the Apache mod_rewrite flags documentation.
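One caveat if the Beanstalk environment sits behind a load balancer that terminates HTTPS: Apache then only ever sees plain HTTP, so a check on %{HTTPS} would redirect forever. A hedged variant for that case, assuming the load balancer sets the X-Forwarded-Proto header (AWS load balancers do) and that this is an illustrative sketch and not a tested configuration:

```apache
RewriteEngine On
# Trust the protocol reported by the load balancer instead of %{HTTPS}
RewriteCond %{HTTP:X-Forwarded-Proto} !=https [OR]
RewriteCond %{HTTP_HOST} !^www\. [NC]
# Capture the host without any leading "www."
RewriteCond %{HTTP_HOST} ^(?:www\.)?(.+)$ [NC]
RewriteRule ^ https://www.%1%{REQUEST_URI} [R=301,L,NE]
```

Load balancer health checks arrive without that header and would be redirected too, so the health-check path may need its own exclusion condition.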
There are different ways to achieve this for Apache, NGINX, and Windows Server. Amazon has a reference article detailing some of the implementation approaches for this. Copying the details of the article here is beyond the scope of your question IMO.
So, to answer your question: Route 53 isn't your only option. You can absolutely use whatever registrar or DNS host you'd like. The issue is that you need to re-think your approach entirely and focus on server-level rules rather than DNS-level rules. I'm no expert and find it annoying to do it this way, so hopefully, someone will jump in with a more insightful approach.

How do I disable the ${appName}.elasticbeanstalk.com URL?

Here's my scenario:
I have an Amazon EB application
I use a third party DNS/cache/attack-protection service (Cloudflare) instead of Route 53
Problem:
Search engines are (also) crawling and indexing my ${appName}.elasticbeanstalk.com URL
Q: How do I disable the ${appName}.elasticbeanstalk.com URL for good to only use my chosen (.com) name?
I will answer with the best thing I've found so far, just to make sure I can help other people.
Assuming there is no way to completely disable the elasticbeanstalk URL, the best thing I found was to add a redirect rule to the .htaccess file.
# Redirect elastic beanstalk addresses to www.example.com
RewriteCond %{HTTP_HOST} elasticbeanstalk\.com [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
You can do something better to consider your testing environments as well.
Couldn't you set a rule on your server to instruct search engines not to crawl that particular url (robots.txt, etc.)?
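A rough sketch of that idea in the same .htaccess file, as an alternative for anyone who prefers not to redirect: send a noindex header only when the request came in on the elasticbeanstalk.com host name. The variable name EB_DEFAULT_HOST is made up for this example, and it assumes mod_setenvif and mod_headers are enabled.

```apache
# Flag requests whose Host header is the default Elastic Beanstalk name
SetEnvIfNoCase Host "elasticbeanstalk\.com" EB_DEFAULT_HOST
# Ask crawlers not to index responses served under that name
Header set X-Robots-Tag "noindex, nofollow" env=EB_DEFAULT_HOST
```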

How to enable SSL for a whole django site with apache2+ubuntu 11?

I need to enable SSL for an entire Django site of mine. Currently the site is hosted with Apache2 on Ubuntu 11.1 and is only accessible through http. I'd like to know the following:
1) Apache configuration for enabling ssl for this site.
2) Django related changes of the same, if any.
Another question of the same kind is unanswered, so asking here again.
You may do it by adjusting your apache config like this:
# Turn on Rewriting
RewriteEngine on
# Apply this rule If request does not arrive on port 443
RewriteCond %{SERVER_PORT} !443
# RegEx to capture request, URL to send it to (tacking on the captured text, stored in $1), Redirect it, and Oh, I'm the last rule.
RewriteRule ^(.*)$ https://www.x.com/dir/$1 [R,L]
Note that this is taken from https://serverfault.com/questions/77831/how-to-force-ssl-https-on-apache-location.
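The rule above only redirects; the site also needs an HTTPS virtual host to land on. A minimal sketch, assuming mod_ssl is enabled (a2enmod ssl on Ubuntu) and using placeholder certificate paths that you would replace with your own:

```apache
<VirtualHost *:80>
    ServerName www.x.com
    RewriteEngine On
    # Send every plain-HTTP request to the HTTPS site
    RewriteRule ^ https://www.x.com%{REQUEST_URI} [R=301,L]
</VirtualHost>

<VirtualHost *:443>
    ServerName www.x.com
    SSLEngine on
    # Placeholder paths, not real files
    SSLCertificateFile /etc/ssl/certs/x.com.crt
    SSLCertificateKeyFile /etc/ssl/private/x.com.key
    # The existing Django (mod_wsgi) directives for the site go here unchanged
</VirtualHost>
```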
There shouldn't be any changes necessary for Django.
HTH.