I have an Application Load Balancer in production with "CreatedTime": "2020-08-25T13:49:18.510000+00:00".
I am trying to set up traffic mirroring, but I get an error saying I can't because the LB was created on a non-Nitro instance.
An LB in my dev environment, created later (2022), works.
I can't find anywhere that shows this information. I have tried describe-load-balancers.
I would have expected AWS to update the LB in the background periodically, which doesn't seem to be the case. So how does one update it with no downtime?
I am using AWS Elastic Beanstalk. In there, I selected a Traffic Splitting deploy strategy, with a 100% split (so that 100% of new instances will have the new version and have their health evaluated).
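As a rough sketch, the setup described above corresponds to Elastic Beanstalk option settings like the ones built below. The namespaces and option names are the ones I understand the traffic-splitting options to use; the helper name `traffic_splitting_settings`, the range check, and the 5-minute evaluation time are illustrative assumptions, not values from this post.

```python
from typing import Dict, List


def traffic_splitting_settings(new_version_percent: int,
                               evaluation_minutes: int) -> List[Dict[str, str]]:
    """Build option settings for a traffic-splitting deployment (hypothetical helper)."""
    # Assumed valid range for the split percentage; check the option reference.
    if not 1 <= new_version_percent <= 100:
        raise ValueError("new_version_percent must be between 1 and 100")
    return [
        {
            "Namespace": "aws:elasticbeanstalk:command",
            "OptionName": "DeploymentPolicy",
            "Value": "TrafficSplitting",
        },
        {
            "Namespace": "aws:elasticbeanstalk:trafficsplitting",
            "OptionName": "NewVersionPercent",
            "Value": str(new_version_percent),
        },
        {
            "Namespace": "aws:elasticbeanstalk:trafficsplitting",
            "OptionName": "EvaluationTime",
            "Value": str(evaluation_minutes),
        },
    ]


# 100% split as in the question; the evaluation time is a placeholder.
settings = traffic_splitting_settings(100, 5)
for option in settings:
    print(option["Namespace"], option["OptionName"], option["Value"])
```

A list in this shape is what the Elastic Beanstalk API accepts as OptionSettings when updating an environment, so it can be reviewed or diffed before anything is applied.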
Here's how (according to their documentation) that is supposed to work:
During a traffic-splitting deployment, Elastic Beanstalk creates a new set of instances in a separate temporary Auto Scaling group. Elastic Beanstalk then instructs the load balancer to direct a certain percentage of your environment's incoming traffic to the new instances. Then, for a configured amount of time, Elastic Beanstalk tracks the health of the new set of instances. If all is well, Elastic Beanstalk shifts remaining traffic to the new instances and attaches them to the environment's original Auto Scaling group, replacing the old instances. Then Elastic Beanstalk cleans up—terminates the old instances and removes the temporary Auto Scaling group.
And more specifically:
Rolling back the deployment to the previous application version is quick and doesn't impact service to client traffic. If the new instances don't pass health checks, or if you choose to abort the deployment, Elastic Beanstalk moves traffic back to the old instances and terminates the new ones.
However, it seems silly that it only looks at my internal /health health check and not at the overall health status of the environment, which it already derives from the HTTP status codes.
I tried the following scenario:
Deploy a new version.
As soon as the "health evaluation period" begins, flood the server with error 500s (from an endpoint I made specifically for this purpose).
AWS then moves all my instances into the "degraded" and "unhealthy" states, but then seems to ignore that and goes on anyway.
See the following two log dump screenshots (they are oldest-first).
Is there any way that I can make AWS respect the HTTP status based health checks that it already performs, during a traffic split? Or am I bound to only rely on custom-developed health checks entirely?
Update 1: Even weirder, I tried making my own health checks always fail too, but it still decides to deploy the new version with the failed health check!
Update 2: I noticed that the temporary Auto Scaling group that it creates while assessing health only has an "EC2" type health check, not "ELB". I think that might be the root cause. If only I could get it to use "ELB" instead.
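One way to confirm the observation in Update 2 is to list the Auto Scaling groups and look at each group's HealthCheckType. The sketch below works on a dictionary shaped like the response of the Auto Scaling DescribeAutoScalingGroups call (for example from boto3's `describe_auto_scaling_groups`); the function name and the sample group names are made up for illustration.

```python
from typing import Any, Dict, List


def groups_without_elb_check(response: Dict[str, Any]) -> List[str]:
    """Return names of Auto Scaling groups that do not use ELB health checks."""
    names = []
    for group in response.get("AutoScalingGroups", []):
        # HealthCheckType may be a single value or a comma-separated list.
        types = [t.strip() for t in group.get("HealthCheckType", "").split(",")]
        if "ELB" not in types:
            names.append(group["AutoScalingGroupName"])
    return names


# Hypothetical sample data standing in for a real API response.
sample_response = {
    "AutoScalingGroups": [
        {"AutoScalingGroupName": "example-env-original", "HealthCheckType": "ELB"},
        {"AutoScalingGroupName": "example-env-temporary", "HealthCheckType": "EC2"},
    ]
}

print(groups_without_elb_check(sample_response))
```

Running this during the evaluation period, against the real response, would show whether the temporary group is the one left on EC2-only checks.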
That is interesting! I do not know whether setting the health check type to "ELB" will do the job, because we use CodeDeploy, which has far better rollback capabilities than AWS Elastic Beanstalk.
However, there is a well-documented way in the docs [1] to apply the setting you are looking for:
[...] By default, the Auto Scaling group, created for your environment uses Amazon EC2 status checks. If an instance in your environment fails an Amazon EC2 status check, Auto Scaling takes it down and replaces it.
Amazon EC2 status checks only cover an instance's health, not the health of your application, server, or any Docker containers running on the instance. If your application crashes, but the instance that it runs on is still healthy, it may be kicked out of the load balancer, but Auto Scaling won't replace it automatically. [...]
If you want Auto Scaling to replace instances whose application has stopped responding, you can use a configuration file to configure the Auto Scaling group to use Elastic Load Balancing health checks. The following example sets the group to use the load balancer's health checks, in addition to the Amazon EC2 status check, to determine an instance's health.
Example .ebextensions/autoscaling.config
Resources:
  AWSEBAutoScalingGroup:
    Type: "AWS::AutoScaling::AutoScalingGroup"
    Properties:
      HealthCheckType: ELB
      HealthCheckGracePeriod: 300
It does not mention the new traffic splitting deployment feature, though.
Thus, I cannot confirm this is the actual solution, but at least you can give it a shot.
[1] https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/environmentconfig-autoscaling-healthchecktype.html
Once upon a time I thought that the Immutable Deployment option in Elastic Beanstalk was a holy panacea -- but it only works when a deployment involves no changes to the application's database schema.
We've now resorted to blue-green deployments. However, this only works if you control the DNS. If you are a SaaS solution and you allow customers to create a CNAME, then B/G is often a spectacular failure because the enterprise a) sets a very high TTL, and/or b) has internal DNS or firewalls that cache the underlying IP addresses of the ALB (which are dynamic and, of course, replaced when you swap the URLs of the blue and green environments).
Traffic splitting is written as an option in the Elastic Beanstalk documentation.
But it's not actually an option in the configuration section in the console.
This wouldn't be the first time I've seen Elastic Beanstalk's docs out of date, so it could be that AWS has removed that feature.
Since AWS introduced CodeStar I suspect Elastic Beanstalk is getting the cold shoulder.
I am hosting a Django site on Elastic Beanstalk. I haven't yet linked it to a custom domain and used to access it through the Beanstalk environment domain name like this: http://mysite-dev.eu-central-1.elasticbeanstalk.com/
Today I did some stress tests on the site, which led it to spin up several new EC2 instances. Shortly afterwards I deployed a new version to the Beanstalk environment via my local command line while 3 instances were still running in parallel. The update failed due to a timeout. Once the environment had terminated all but one instance, I tried the deployment again. This time it worked. But since then I cannot access the site through the EB environment domain name anymore. I always get a "took too long to respond" error.
I can access it through my ec2 instance's IP address as well as through my load balancer's DNS. The beanstalk environment is healthy and the logs are not showing any errors. The beanstalk environment's domain is also part of my allowed hosts setting in Django. So my first assumption was that there is something wrong in the security group settings.
Since the load balancer is getting through, it seems that the issue is with the Beanstalk environment's domain. As I understand it, the Beanstalk domain name points to the load balancer, which then forwards to the instances? So could it be that the environment update, in combination with new instances spinning up, has somehow corrupted the connection? If yes, how do I fix this, and if no, what else could be the cause?
Being a developer and a newbie to cloud hosting, my understanding is fairly limited in this respect. My issue seems to be similar to this one: "Elastic Beanstalk URL root not working - EC2 Elastic IP and Elastic IP Public DNS working", but it hasn't helped me further.
Many Thanks!
Update: After one day everything is back to normal. The environment URL works as previously as if the dependencies had recovered overnight.
Obviously a server can experience downtime, but since the site worked fine when accessing the ec2 instance ip and the load balancer dns directly, I am still a bit puzzled about what's going on here.
If anyone has an explanation for this behaviour, I'd love to hear it.
Otherwise, for those experiencing similar issues after a botched update: Before tearing out your hair in desperation, try just leaving the patient alone overnight and let the AWS ecosystem work its magic.
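For anyone who wants to narrow this down rather than just wait, one check is whether the environment domain and the load balancer's DNS name resolve to any of the same addresses. The following is only a sketch under that assumption: `resolve_ipv4` and `shared_addresses` are hypothetical helper names, and the addresses in the sample are documentation-range placeholders, not values from my environment.

```python
import socket
from typing import Iterable, Set


def resolve_ipv4(hostname: str) -> Set[str]:
    """Resolve a hostname to the set of IPv4 addresses it currently returns."""
    infos = socket.getaddrinfo(hostname, 80, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, (address, port)).
    return {info[4][0] for info in infos}


def shared_addresses(env_ips: Iterable[str], lb_ips: Iterable[str]) -> Set[str]:
    """Return the addresses that both names resolve to."""
    return set(env_ips) & set(lb_ips)


# Placeholder addresses; in practice pass resolve_ipv4(<environment domain>)
# and resolve_ipv4(<load balancer DNS name>).
env_ips = {"192.0.2.10", "192.0.2.11"}
lb_ips = {"192.0.2.11", "198.51.100.7"}
print(sorted(shared_addresses(env_ips, lb_ips)))
```

An empty result would suggest that the environment domain is still pointing at addresses the load balancer no longer uses, which would be consistent with the problem clearing up on its own once the DNS records caught up.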
I have an Elastic Beanstalk app running on Docker set up with autoscaling. When another instance is added to my environment as a result of autoscaling, it will 502 while the instance goes through the deployment process. If I ssh into the relevant box, I can see (via docker ps) that docker is in the process of setting itself up.
How can I prevent my load balancer from directing traffic to the instance until after the instance deployment has actually completed? I found this potentially related question on SuperUser, but I think my health check URL is set up properly. I have it pointing at the root of the domain, which definitely 502s when I navigate to it in my browser, so I suspect that's not the cause of my problem.
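One pattern that may help here is a dedicated health endpoint that returns 503 until the application has finished starting, so the load balancer's health check cannot pass early. The sketch below uses only Python's standard library; the `/health` path, the `app_ready` flag, and the handler name are assumptions for illustration, and it only helps if the load balancer's health check targets that path and expects a 200.

```python
import threading
from http.server import BaseHTTPRequestHandler

# Set this once the application has finished starting (hypothetical hook).
app_ready = threading.Event()


def health_status(ready: bool) -> int:
    """Return 200 when the app can serve traffic, otherwise 503."""
    return 200 if ready else 503


class HealthHandler(BaseHTTPRequestHandler):
    """Answers GET /health with 200 or 503 depending on readiness."""

    def do_GET(self):
        if self.path == "/health":
            status = health_status(app_ready.is_set())
        else:
            status = 404
        self.send_response(status)
        self.end_headers()


print(health_status(app_ready.is_set()))
```

The handler could be served with `http.server.HTTPServer` on whatever port the container exposes, with `app_ready.set()` called at the end of the startup sequence.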
I have created two instances in AWS (one is Live and the other is Backup). My website is hosted on the Live instance. What I want is that, if the Live instance's status check fails, traffic should switch to the Backup instance. Is there any automated process to achieve this?
It is not a good idea to keep one instance idle and pay for it. Put both under an Elastic Load Balancer and start using both of them. The ELB health check will automatically remove instances that stop working. Then you can monitor the number of healthy instances under your ELB with Amazon CloudWatch and set up an alert to get an email when something happens, or you can even autoscale.
Liviu Costea's answer is the best way to go. If you insist on keeping only one active server at a time, and you are using Route 53 for your DNS, then you can use Route 53 health checks to switch the domain name resolution from your primary server to your secondary server if your primary server goes out of service.
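A minimal sketch of what that failover setup could look like, expressed as the change batch that the Route 53 ChangeResourceRecordSets call accepts (for example via boto3's `change_resource_record_sets`). The helper names, domain, IP addresses, and health check ID below are all placeholders for illustration.

```python
from typing import Any, Dict, Optional


def failover_record(name: str, ip: str, role: str, set_id: str,
                    health_check_id: Optional[str] = None,
                    ttl: int = 60) -> Dict[str, Any]:
    """Build one UPSERT change for a failover A record (hypothetical helper)."""
    if role not in ("PRIMARY", "SECONDARY"):
        raise ValueError("role must be PRIMARY or SECONDARY")
    record: Dict[str, Any] = {
        "Name": name,
        "Type": "A",
        "SetIdentifier": set_id,
        "Failover": role,
        "TTL": ttl,  # a short TTL lets clients pick up the switch quickly
        "ResourceRecords": [{"Value": ip}],
    }
    if health_check_id is not None:
        record["HealthCheckId"] = health_check_id
    return {"Action": "UPSERT", "ResourceRecordSet": record}


def failover_change_batch(name: str, live_ip: str, backup_ip: str,
                          health_check_id: str) -> Dict[str, Any]:
    """Primary record tied to a health check, secondary as the fallback."""
    return {
        "Comment": "Fail over from Live to Backup",
        "Changes": [
            failover_record(name, live_ip, "PRIMARY", "live", health_check_id),
            failover_record(name, backup_ip, "SECONDARY", "backup"),
        ],
    }


# Placeholder values only.
batch = failover_change_batch("www.example.com.", "192.0.2.10",
                              "192.0.2.20", "example-health-check-id")
for change in batch["Changes"]:
    print(change["ResourceRecordSet"]["Failover"],
          change["ResourceRecordSet"]["ResourceRecords"][0]["Value"])
```

The health check itself has to exist in Route 53 first. While the check on the Live instance passes, queries resolve to the primary record; when it fails, Route 53 answers with the secondary.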