How do you put up a maintenance page for AWS when your instances are behind an ELB? - amazon-web-services

How do you put up a maintenance page in AWS when you want to deploy new versions of your application behind an ELB? We want to have the ELB route traffic to the maintenance instance while the new auto-scaled instances are coming up, and only "flip over" to the new instances once they're fully up. We use auto-scaling to bring existing instances down and new instances, which have the new code, up.
The scenario we're trying to avoid is having the ELB send traffic to the new EC2 instances while also serving the maintenance page. Since we don't have sticky sessions enabled, we want to prevent users from being flipped back and forth between the maintenance page and the application running on an EC2 instance. We also can't just scale up (say from 2 to 4 instances and then back to 2) to introduce the new instances, because the code changes might involve database changes that would break the old code.

I realise this is an old question but after facing the same problem today (December 2018), it looks like there is another way to solve this problem.
Earlier this year, AWS added support for redirects and fixed responses to Application Load Balancers. In a nutshell:
Locate your ELB in the console.
View the rules for the appropriate listener.
Add a fixed 503 response rule for your application's host name.
Optionally provide a text/plain or text/html response (i.e. your maintenance page HTML).
Save changes.
Once the rule propagates to the ELB (took ~30 seconds for me), when you try to visit your host in your browser, you'll be shown the 503 maintenance page.
When your deployment completes, simply remove the rule you added.
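If you'd rather script the rule than click through the console, here is a minimal sketch that builds the arguments for `aws elbv2 create-rule`. The host name, listener ARN placeholder, priority and page HTML are assumptions for illustration, and the 1024-character body limit is what I understand the fixed-response limit to be, so check the ELB docs before relying on it.

```python
import json
import shlex

# Assumed limit on fixed-response bodies; verify against the ELB documentation.
MAX_BODY = 1024


def maintenance_rule(host, html, priority=1):
    """Build the conditions/actions for a host-based fixed 503 rule."""
    if len(html) > MAX_BODY:
        raise ValueError("maintenance page too large for a fixed response")
    conditions = [{"Field": "host-header", "Values": [host]}]
    actions = [{
        "Type": "fixed-response",
        "FixedResponseConfig": {
            "StatusCode": "503",
            "ContentType": "text/html",
            "MessageBody": html,
        },
    }]
    return {"Priority": priority, "Conditions": conditions, "Actions": actions}


def as_cli(listener_arn, rule):
    """Render the rule as an `aws elbv2 create-rule` command line."""
    parts = [
        "aws", "elbv2", "create-rule",
        "--listener-arn", listener_arn,
        "--priority", str(rule["Priority"]),
        "--conditions", json.dumps(rule["Conditions"]),
        "--actions", json.dumps(rule["Actions"]),
    ]
    return " ".join(shlex.quote(p) for p in parts)


# Hypothetical host name and listener ARN placeholder.
rule = maintenance_rule("app.example.com", "<h1>Down for maintenance</h1>")
print(as_cli("arn:aws:elasticloadbalancing:REGION:ACCOUNT:listener/app/EXAMPLE", rule))
```

Removing the rule after the deployment would be the matching `aws elbv2 delete-rule` call with the rule ARN returned by the create step.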

The simplest way on AWS is to use Route 53, their DNS service.
You can use its Weighted Round Robin (WRR) feature.
"You can use WRR to bring servers into production, perform A/B testing, or balance your traffic across regions or data centers of varying sizes."
More information is available in the AWS documentation for this feature.
EDIT: Route 53 recently added a new feature that allows DNS Failover to S3. Check their documentation for more details: http://docs.aws.amazon.com/Route53/latest/DeveloperGuide/dns-failover.html
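To make the weighting concrete: Route 53 answers with a given weighted record with probability weight / sum of all weights in the group. Below is a rough sketch of that arithmetic, plus the shape of a change-batch entry for a weighted CNAME. The record names, targets and the 60-second TTL are made up for illustration.

```python
def traffic_share(weights):
    """Return each record's expected share of DNS responses.

    A record is chosen with probability weight / sum(all weights).
    A weight of 0 gets no traffic unless every record in the group is 0.
    """
    total = sum(weights.values())
    if total == 0:
        # All weights zero: records are returned with equal probability.
        return {name: 1 / len(weights) for name in weights}
    return {name: w / total for name, w in weights.items()}


def weighted_record(name, target, set_id, weight, ttl=60):
    """One UPSERT entry for a weighted CNAME in a Route 53 change batch."""
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": name,
            "Type": "CNAME",
            "SetIdentifier": set_id,
            "Weight": weight,
            "TTL": ttl,
            "ResourceRecords": [{"Value": target}],
        },
    }


# Hypothetical names: send everything to the maintenance endpoint.
print(traffic_share({"app": 0, "maintenance": 100}))
```

Flipping into maintenance mode would then mean upserting both records with the weights swapped, keeping in mind that clients only notice once their cached answer expires.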

Came up with another solution that's working great for us. Here are the required steps to get a simple 503 http response:
Replicate your EB environment to create another one, call it something like app-environment-maintenance, for instance.
Change the configuration for autoscaling and set the min and max servers both to zero. This won't cost you any EC2 servers and the environment will turn grey and sit in your list.
Finally, you can use the AWS CLI to now swap the environment CNAME to take your main environment into maintenance mode. For instance:
aws elasticbeanstalk swap-environment-cnames \
    --profile "$awsProfile" \
    --region "$awsRegion" \
    --output text \
    --source-environment-name app-prod \
    --destination-environment-name app-prod-maintenance
This would swap your app-prod environment into maintenance mode. The ELB then returns a 503 because there aren't any running EC2 instances, and CloudFront can catch the 503 and return your custom 503 error page, should you wish, as described below.
Bonus configuration for custom error pages using CloudFront:
We use CloudFront, as many people will, for HTTPS, etc. CloudFront supports custom error pages. Using CloudFront is a requirement for this part.
Create a new S3 website hosting bucket with your error pages. Consider creating separate files for each response code (503, etc.). See #6 for directory requirements and routes.
Add the S3 bucket to your CloudFront distribution.
Add a new behavior to your CloudFront distribution for a route like /error/*.
Set up an error page in CloudFront to handle 503 response codes and point it to your S3 bucket route, like /error/503-error.html
Now, when your ELB throws a 503, your custom error page will be displayed.
And that's it. I know there are quite a few steps to get the custom error pages and I tried a lot of the suggested options out there including Route53, etc. But all of these have issues with how they work with ELBs and Cloudfront, etc.
Note that after you swap the hostnames for the environments, it takes about a minute or so to propagate.

Route53 is not a good solution for this problem. It takes a significant amount of time for DNS entries to expire before the maintenance page shows up (and then it takes that same amount of time before they update after maintenance is complete). I realize that Lambda and CodeDeploy triggers did not exist at the time this question was asked, but I wanted to let others know that Lambda can be used to create a relatively clean solution for this, which I have detailed in a blog post:
http://blog.ajhodges.com/2016/04/aws-lambda-setting-temporary.html
The gist of the solution is to subscribe a Lambda function to CodeDeploy events. During deployments, the function replaces your ASG in the load balancer with a micro instance serving a static page.

As far as I could see, we were in a situation where the above answers didn't apply or weren't ideal.
We have a Rails application on the "Puma with Ruby 2.3 running on 64bit Amazon Linux/2.9.0" platform, which seems to come with a (classic) ELB.
So ALB 503 handling wasn't an option.
We also have a variety of hardware clients that I wouldn't trust to always respect DNS TTL, so Route53 is risky.
What did seem to work nicely is a secondary port on the nginx that comes with the platform.
I added this as .ebextensions/maintenance.config:
files:
  "/etc/nginx/conf.d/maintenance.conf":
    content: |
      server {
        listen 81;
        server_name _ localhost;
        root /var/app/current/public/maintenance;
      }
container_commands:
  restart_nginx:
    command: service nginx restart
And dropped a copy of https://gist.github.com/pitch-gist/2999707 into public/maintenance/index.html
Now to set maintenance mode I just switch my ELB listeners to point to port 81 instead of the default 80. No extra instances, no S3 buckets and no waiting for clients to refresh DNS.
It only takes ~15s or so for Beanstalk (probably mostly waiting for CloudFormation in the back end) to apply the change.
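For anyone who wants to script the port switch, here is a sketch under the assumption that the classic ELB listener is managed through the `aws:elb:listener:80` option namespace and its `InstancePort` option. The answer above doesn't say whether the switch was done in the console or the CLI, and the environment name is hypothetical.

```python
def listener_port_setting(instance_port, listener_port=80):
    """Option setting that points a classic ELB listener at an instance port."""
    return {
        "Namespace": f"aws:elb:listener:{listener_port}",
        "OptionName": "InstancePort",
        "Value": str(instance_port),
    }


def update_command(env_name, setting):
    """Render the setting as an `aws elasticbeanstalk update-environment` call."""
    shorthand = ",".join(f"{key}={value}" for key, value in setting.items())
    return (
        "aws elasticbeanstalk update-environment "
        f"--environment-name {env_name} --option-settings {shorthand}"
    )


# Port 81 serves the maintenance page; port 80 is the application.
print(update_command("app-prod", listener_port_setting(81)))  # enter maintenance
print(update_command("app-prod", listener_port_setting(80)))  # leave maintenance
```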

Our deployment process first runs a CloudFormation template to spin up an EC2 micro instance (the maintenance instance), which copies a predefined static page from S3 onto itself. The CloudFormation stack is given the ELBs to which the micro instance should be attached. Then a script (PowerShell or CLI) removes the web instances (EC2) from the ELBs, leaving only the maintenance instance.
This way we switch to the maintenance instance during the deployment process.
In our case, we have two ELBs, one external and one internal. The internal ELB is not updated during this process, which is how we run our post-deployment production smoke tests.
Once testing is done, we run another script to attach the web instances back to the ELBs and delete the maintenance stack.

Related

Setting up Latency Routing in AWS

I've been digging in the AWS docs for ages and am at my wits' end trying to find examples that aren't from the official AWS docs.
How do I decide whether I should have failover routing, latency routing, or both? I currently have the site on Elastic Beanstalk with both a dev and a production version, but I get 500 or 502 errors at least a couple of times a month. If you refresh the page, it eventually loads, but then the CSS is missing or the page doesn't load, and sometimes the page is just slow to load even with caching. How am I supposed to know whether I need failover or latency routing, or both? The AWS notifications only say "Environment health has transitioned from Degraded to Severe". How do I log which AWS server Route 53 used to serve the page?
Are you supposed to have multiple EC2 instances for latency based routing? I’m confused why the docs say to create a latency record for each of my EC2 instances.
https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/TutorialTransitionToLBR.html
I currently have CodePipeline connected to my GitHub, so that changes are automatically deployed to the dev site, and then I manually approve changes to production. If I have multiple EC2 instances, do I need to set up the pipeline for each EC2 instance so that it's connected to my GitHub and then manually approve changes for all instances (i.e. would I just have multiple copies of the site hosted in different regions in this situation)? How do people manage this? I'm assuming there's some way to approve the production launch for all of them at once if this is what is done, but I don't know what to google.

HAproxy vs ALB or any other load balancer which one to use?

We are looking to separate our blog platform to a separate ec2 server (In Nginx) for better performance and scalability.
Scenario is:
Web request (www.example.com) -> Load Balancer/Route -> Current EC2 Server
Blog request (www.example.com/blog) -> Load Balancer/Route -> New Separate EC2 Server for blog
Please help: what is the best option to use in this case?
Haproxy
ALB - AWS
Any other solution?
Also, is it possible to have the load balancer or routing mechanism in a different AWS region? We are currently hosted in AWS.
Haproxy
You would have to set this up on an EC2 server and manage everything yourself. You would be responsible for scaling this correctly to handle all the traffic it gets. You would be responsible for deploying it to multiple availability zones to provide high availability. You would be responsible for installing all security updates on the operating system.
ALB - AWS
Amazon will automatically scale this out to handle any amount of traffic you get. Amazon will handle all security patches of the underlying system. Amazon provides free SSL certificates for ALBs. Amazon will deploy this automatically across multiple availability zones to provide high availability.
Any other solution?
I think AWS Global Accelerator would work here as well, but you would have to weigh the differences between Global Accelerator and ALB to decide which fits your use case and budget the best.
You could also look at placing a CDN in front of everything, like CloudFront or Cloudflare.
Also, is it possible to have the load balancer or routing mechanism in a different AWS region?
AWS Global Accelerator would be the thing to look at if load balancing in different regions is a concern for you. Given the details you have provided I'm not sure why you would want this however.
Probably what you really need is a CDN in front of your websites, with or without the ALB.
Scenario is:
Web request (www.example.com) -> Load Balancer/Route -> Current EC2 Server
Blog request (www.example.com/blog) -> Load Balancer/Route -> New Separate EC2 Server for blog
In my view you can use an ALB deployed across multiple AZs for high availability, for the following reasons:
An AWS ALB lets you route traffic based on various attributes, and the path in the URL is one of them.
https://docs.aws.amazon.com/elasticloadbalancing/latest/application/load-balancer-listeners.html#rule-condition-types
With an ALB you can have two target groups of instances handling traffic: one for the first path (www.example.com) and a second target group for the other path (www.example.com/blog).
The ALB supports SNI (which lets a single ALB handle multiple certificates for multiple domains), so all you need to do is set up a single HTTPS listener and upload your certificates: https://aws.amazon.com/blogs/aws/new-application-load-balancer-sni/
I have answered [something similar]; it might help you as well.
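To illustrate the path-based routing described above, here is a small local simulation of how listener rules are evaluated: lowest priority number first, default rule last. It is only a sketch. The target group names, the priority and the `/blog*` pattern are made up, and `fnmatch` is just an approximation of ALB wildcard matching (ALB patterns support `*` and `?`).

```python
from fnmatch import fnmatchcase

# (priority, path pattern, target group); hypothetical values.
RULES = [
    (10, "/blog*", "blog-target-group"),
]
DEFAULT_TARGET = "web-target-group"


def route(path, rules=RULES, default=DEFAULT_TARGET):
    """Return the target group for a request path.

    Rules are checked in ascending priority order; the first match wins,
    and the default action applies when nothing matches.
    """
    for _priority, pattern, target in sorted(rules):
        if fnmatchcase(path, pattern):  # case-sensitive, like ALB path patterns
            return target
    return default


for path in ("/", "/about", "/blog", "/blog/post-1"):
    print(path, "->", route(path))
```

Note that a pattern like `/blog*` would also match `/blogger`; using `/blog` and `/blog/*` as two values is stricter if that matters.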
This is my opinion, take it as that. I am sure a lot of people won't agree.
If your project is small or personal, you can go with HAProxy (cheap: USD 4 or less if you get a t3a as a spot instance, or free if you place it inside another EC2 instance of yours, maybe using Docker).
If your project is not personal or not small, go with ALB (expensive, but simpler and better integrated with other AWS services).
HAProxy can handle tons of connections, but you have to do more things yourself. ALB can also handle tons of connections, and AWS will do most of the work.
I think HAProxy is more suitable for personal/small projects because if your project doesn't grow, you don't have to touch HAProxy. It is set-and-forget, the same as ALB, but costs less.
You usually won't care about availability zones or disaster tolerance in a personal project, so HAProxy should be easy to configure.
Another consideration: AWS offers a free tier on ALB, so if your project will run for less than a year, ALB is the way to go.
If you are learning, then ALB should be considered, because real clients usually love to stick to AWS in all aspects, and HAProxy is your call and also your risk (just to reduce cost for a company that usually pays a lot more for your salary, so it's not worth the risk).

I want to gradually move from Heroku to AWS. How do I set up a "Weighted routing policy" in Route 53?

This problem has been hurting my brain for almost the whole weekend. I hope someone will come and release me :-)
I want to move a web application from Heroku to AWS gradually. That is, we start by routing 10% of the requests to AWS and increase that number over time, once our canary tests have passed and everything runs smoothly. FYI: the database has already been moved to AWS and can also be accessed by Heroku via a Network Load Balancer.
The setup should also be able to serve a maintenance page (running from an S3 bucket with CloudFront) when, in some hopefully rare case, the health checks for both are failing. I've added an extra alias record for that with a weight of 0, because Route 53 will always try to give a result when all checks are failing, even if the weight is set to zero.
We need the Application Load Balancer to route all the traffic to the correct ECS containers, and it also handles some redirects (apex to www, and http to https) for us.
With all these requirements, I came up with the diagram shown below.
During implementation, I ran into a problem that I can't solve.
I can't create one specific A record (the one with weight 100), because it tries to use as its alias target a record set of a different type (CNAME), and Route 53 doesn't allow that.
It has to be an A record, because when you use the weighted routing policy, all the DNS records in the group must be of the same type.
The records with weights 90 and 10 also have to be CNAMEs (they need to be of the same type as each other as well), because I can't use an A record for my Heroku endpoint.
Does anyone have an idea how to solve this? Or maybe know a better way to do this?

How to handle canary releases on AWS elasticbeanstalk?

I have previously seen it done by having one EC2 instance running HAProxy, configured via a json file/lambda function, that in turn controlled the traffic with sticky sessions, into two separate elasticbeanstalk applications. So we have two layers of load balancing.
However, this has a few issues, one being: Testing several releases becomes expensive, requires more and more EB applications.
By canary release, I mean, being able to release to only a percentage of traffic, to figure out any errors that escaped the devs, the review process, and the QA process, without affecting all traffic.
What would be the best way to handle such a setup with AWS resources and not break the bank? :)
I found this Medium article that explains the use of a passive autoscaling group: you deploy the canary version into it and monitor its statistics. Once you are satisfied with the result, you can change the desired count for the canary autoscaling group to 0 and perform a rolling upgrade of the active autoscaling group.
Here is the link to the article: https://engineering.klarna.com/simple-canary-releases-in-aws-how-and-why-bf051a47fb3f
The way you would achieve canary testing with Elastic Beanstalk is as follows:
Create a 2nd beanstalk environment to which you deploy the canary release
Use a Route53 Weighted routing policy to send a percentage of the DNS requests to your canary environment.
If you're happy with the performance of the canary you can then route 100% of the traffic to the canary env, etc.
Something to keep in mind with DNS routing is that weighted routing is not an exact science, since clients cache DNS based on the TTL you set in Route53. In the extreme scenario where you have e.g. only a single client calling your beanstalk environment (such as a single web server) and the TTL is set to 5 minutes, the switch between environments could happen only every 5 minutes.
Therefore for weighted routing it is recommended to use a fairly low TTL value. Additionally having many clients (e.g. mobile phones) works better in conjunction with DNS routing.
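The TTL effect can be seen with a toy simulation. This is only a sketch with made-up numbers: one client sends a request every second for an hour and re-resolves DNS only when its cached answer has expired, and each resolution picks an environment in proportion to its weight.

```python
import random


def simulate_client(weights, ttl, duration, request_interval, seed=0):
    """Count which environment one caching client hits over `duration` seconds.

    Returns (hits per environment, number of DNS resolutions performed).
    """
    rng = random.Random(seed)
    names = list(weights)
    hits = dict.fromkeys(names, 0)
    resolutions = 0
    cached, expires = None, 0
    t = 0
    while t < duration:
        if cached is None or t >= expires:
            # Cache expired: ask "DNS" again, weighted by record weight.
            cached = rng.choices(names, weights=[weights[n] for n in names])[0]
            expires = t + ttl
            resolutions += 1
        hits[cached] += 1
        t += request_interval
    return hits, resolutions


weights = {"old": 90, "canary": 10}  # hypothetical environment names
for ttl in (300, 5):
    hits, resolutions = simulate_client(weights, ttl, duration=3600, request_interval=1)
    print(f"TTL {ttl}s: {resolutions} resolutions, hits {hits}")
```

With a 300-second TTL the client makes only 12 weighted choices in the hour, so its observed split can be far from 90/10. A short TTL or many independent clients gives many more draws and a split closer to the configured weights.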
Alternatively it might be possible to create a separate LB in front of the two beanstalk environments that balances requests between them. However, I'm not 100% sure whether an LB can sit in front of other (beanstalk) LBs. I suspect the answer is no, but I haven't tried it yet.
Modifying the autoscaling group in elastic beanstalk is not possible, since the LB is managed by beanstalk and beanstalk can decide to revert the changes you did manually on the LB. Additionally beanstalk does not allow you to deploy to a subset of instances while keeping the older version on another subset.
Hope this helps.
Traffic splitting is supported natively by Elastic Beanstalk.
Be sure to select a "high availability" config preset when creating your application environment (by clicking on "configure more options"), as this will configure a load balancer for your env.
Then edit the "Rolling updates and deployments" section of your environment and choose "Traffic splitting" as your deployment strategy.

How can I get useful load testing data for my AWS server?

I have a system set up on AWS where I have a set of EC2 instances (as an application server from Elastic Beanstalk) running in an auto-scaling, load-balanced environment. All this works fine.
I would like to load test this system in order to obtain results that help me figure out what more needs to be done for it to handle, potentially, millions of users. I have used a tool called Locust (http://locust.io) so far to do this. This allows me to send requests to my instance(s?) through a proxy as desired. However, I cannot tell whether the requests are being routed to multiple instances or constantly to the same one; and if they are being load balanced appropriately, I can't see how many requests each of the EC2 instances is receiving or their health under load. (I have a feeling that the requests are not being properly load balanced, as the failure rate always seems to increase drastically at a similar point in every test run.)
Is there a way to get this information from inside the AWS EC2 or Elastic Beanstalk consoles, or is there a better distributed web-based load testing tool that can provide the data I need?
There are two ways to get this information
1) Create an S3 bucket and save the ELB logs to it. You can filter these logs to check which instance is serving each request
2) Retrieve application-level logs: if apache/nginx is installed on your EC2 instances to serve the requests, filter the apache/nginx logs on every machine
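As a rough idea of what filtering the ELB logs in option 1 could look like, here is a sketch that counts requests and 5xx responses per backend. It assumes the classic ELB access log layout, where the fourth field is `backend:port` and the eighth is the ELB status code (ALB logs use a different layout). The two sample lines are made up.

```python
from collections import Counter


def requests_per_backend(lines):
    """Count total requests and 5xx responses per backend instance.

    Assumes classic ELB access log lines: timestamp, elb name, client:port,
    backend:port, three timing fields, elb_status_code, backend_status_code, ...
    """
    total, errors = Counter(), Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 9:
            continue  # skip blank or truncated lines
        backend = fields[3]
        total[backend] += 1
        if fields[7].startswith("5"):
            errors[backend] += 1
    return total, errors


# Made-up sample lines in the assumed format.
SAMPLE = [
    '2015-05-13T23:39:43.945958Z my-elb 192.168.131.39:2817 10.0.0.1:80 '
    '0.000073 0.001048 0.000057 200 200 0 29 '
    '"GET http://www.example.com:80/ HTTP/1.1" "curl/7.38.0" - -',
    '2015-05-13T23:39:44.945958Z my-elb 192.168.131.39:2818 10.0.0.2:80 '
    '0.000073 0.001048 0.000057 503 503 0 0 '
    '"GET http://www.example.com:80/ HTTP/1.1" "curl/7.38.0" - -',
]

total, errors = requests_per_backend(SAMPLE)
for backend in sorted(total):
    print(backend, "requests:", total[backend], "5xx:", errors[backend])
```

If one backend shows nearly all of the requests, or all of the 5xx responses, that points at a balancing or instance problem rather than overall capacity.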
Hope it helps !!
There is a way to get this data from the AWS console.
Inside the elastic beanstalk console there is a tab titled health. This tab (in the enhanced health overview) shows the number of requests per second, the response for the requests, the latency, the load average and the CPU utilisation for each ec2 instance being run by the elastic beanstalk.
An example of this data is shown in the following image.
This data allows the system manager to see which of their back-end instances are receiving requests and how many they are each being sent through a load-balancer and a proxy.
This can also be attained from the AWS CLI using:
eb health environment_name