I wonder if there is a simple way or best practices on how to ensure all instances within an AutoScaling group have been launched with the current launch-configuration of that AutoScaling group.
To give an example, imagine an auto-scaling group called www-asg with 4 desired instances running webservers behind an ELB. I want to change the AMI or the userdata used to start instances of this auto-scaling group. So I create a new launch configuration www-cfg-v2 and update www-asg to use that.
# create new launch config
as-create-launch-config www-cfg-v2 \
--image-id 'ami-xxxxxxxx' --instance-type m1.small \
--group web,asg-www --user-data "..."
# update my asg to use new config
as-update-auto-scaling-group www-asg --launch-configuration www-cfg-v2
By now all 4 running instances still use the old launch configuration. I wonder if there is a simple way of replacing all running instances with new instances to enforce the new configuration, but always ensure that the minimum of instances is kept running.
My current way of achieving this is as follows:
1. Save the list of currently running instances for the given auto-scaling group.
2. Temporarily increase the number of desired instances by 1.
3. Wait for the new instance to be available.
4. Terminate one instance from the list via
   as-terminate-instance-in-auto-scaling-group i-XXXX \
   --no-decrement-desired-capacity --force
5. Wait for the replacement instance to be available.
6. If more than 1 instance is left, repeat from step 4.
7. Terminate the last instance from the list via
   as-terminate-instance-in-auto-scaling-group i-XXXX \
   --decrement-desired-capacity --force
8. Done; all instances should now run with the same launch config.
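For illustration, the procedure above could be sketched with the current unified AWS CLI instead of the legacy as-* tools. This is only a sketch under assumptions: the function names are made up for this example, the group's max size is assumed to allow desired + 1, and "available" is taken to mean InService in the group (it does not check ELB health).

```shell
#!/usr/bin/env bash
# Sketch only: rolling replacement of all instances in one Auto Scaling group.
# Assumes a configured AWS CLI and a group max size of at least desired + 1.
# All function names here are hypothetical helpers, not AWS tooling.

# Print instance IDs of group $1, one per line; $2 is an optional JMESPath filter.
asg_ids() {
  aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names "$1" \
    --query "AutoScalingGroups[0].Instances[$2].InstanceId" \
    --output text | tr '\t' '\n'
}

# Count InService instances of group $1 whose ID is not in the old list $2.
new_in_service() {
  asg_ids "$1" "?LifecycleState=='InService'" | grep -v -x -F -e "$2" | grep -c .
}

# Poll until at least $3 new instances are InService.
wait_for_new() {
  until [ "$(new_in_service "$1" "$2")" -ge "$3" ]; do sleep 15; done
}

rolling_replace() {
  asg="$1"
  old="$(asg_ids "$asg" "")"                        # 1. remember current instances
  total="$(printf '%s\n' "$old" | grep -c .)"
  aws autoscaling set-desired-capacity --auto-scaling-group-name "$asg" \
    --desired-capacity "$((total + 1))"             # 2. desired + 1
  wait_for_new "$asg" "$old" 1                      # 3. wait for the extra instance
  n=0
  for id in $old; do
    n=$((n + 1))
    if [ "$n" -lt "$total" ]; then
      # 4-6. terminate one old instance and wait for its replacement
      aws autoscaling terminate-instance-in-auto-scaling-group \
        --instance-id "$id" --no-should-decrement-desired-capacity
      wait_for_new "$asg" "$old" "$((n + 1))"
    else
      # 7. last old instance: shrink back to the original desired capacity
      aws autoscaling terminate-instance-in-auto-scaling-group \
        --instance-id "$id" --should-decrement-desired-capacity
    fi
  done
}
```

It would be invoked as `rolling_replace www-asg`. Counting only instances that were not in the original list avoids mistaking a not-yet-terminated old instance for its replacement.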
I have mostly automated this procedure, but I feel there must be a better way of achieving the same goal. Does anyone know a better, more efficient way?
mathias
Also posted this question in the official AWS EC2 Forum.
Old question I know but I thought I would share my approach.
I change the launch config for an ASG, then launch the same number of instances as are currently in the ASG. As they become available (automated testing), they are attached to the ASG. Once the machines have been added, our deployment system updates our Varnish load balancer(s) to use the new instances, and the old instances are terminated.
All of the above is automated and a full site scale switch takes about 5 minutes depending on the launch time.
In case you are wondering, we use SNS to handle updating Varnish when instances are added or removed. In the case of our load balancers scaling (which almost never happens), the deployment system will update our Route 53 config instead.
I think that pretty much covers everything
This isn't a lot different, but you could:
create the new LC
create a new ASG using the new LC
scale down the old ASG
delete the old asg and LC
I do deployments this way, and in my experience it's simpler to roll from one ASG to another than to jump back and forth within a single group. But as I noted, it's not a huge difference.
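A rough sketch of that roll from one ASG to another with the AWS CLI might look like the following. The function names and argument order are invented for this example, the new launch configuration is assumed to exist already, and a classic ELB is assumed. Note that InService in the group does not by itself prove the load balancer considers the instances healthy.

```shell
#!/usr/bin/env bash
# Sketch only: bring up a new ASG from a new launch configuration,
# then scale down and delete the old ASG. Helper names are hypothetical.

# Count instances in group $1; $2 is an optional JMESPath filter.
asg_count() {
  aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names "$1" \
    --query "length(AutoScalingGroups[0].Instances[$2])" --output text
}

# Args: old ASG, new ASG, new LC, comma-separated subnet IDs, size, ELB name.
roll_to_new_asg() {
  old_asg="$1"; new_asg="$2"; new_lc="$3"; subnets="$4"; size="$5"; elb="$6"

  # Create the new ASG using the new LC
  aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name "$new_asg" \
    --launch-configuration-name "$new_lc" \
    --min-size "$size" --max-size "$size" --desired-capacity "$size" \
    --vpc-zone-identifier "$subnets" \
    --load-balancer-names "$elb" \
    --health-check-type ELB --health-check-grace-period 300

  # Wait until the new group reports enough InService instances
  until [ "$(asg_count "$new_asg" "?LifecycleState=='InService'")" -ge "$size" ]; do
    sleep 15
  done

  # Scale down the old ASG and wait for it to empty
  aws autoscaling update-auto-scaling-group \
    --auto-scaling-group-name "$old_asg" --min-size 0 --desired-capacity 0
  until [ "$(asg_count "$old_asg" "")" -eq 0 ]; do sleep 15; done

  # Delete the old ASG; its LC can be removed afterwards with
  # "aws autoscaling delete-launch-configuration --launch-configuration-name ..."
  aws autoscaling delete-auto-scaling-group --auto-scaling-group-name "$old_asg"
}
```

A call could look like `roll_to_new_asg www-asg-v1 www-asg-v2 www-cfg-v2 subnet-aaaa,subnet-bbbb 4 www-elb`, where every name is a placeholder.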
It might be worth looking at: https://github.com/Netflix/asgard , which is a Netflix OSS tool for managing autoscaling groups. I ended up not using it, but it's pretty interesting nonetheless.
I have created a cluster to run our test environment on AWS ECS. Everything seems to work fine, including zero-downtime deploys. But I realised that when I change instance types in CloudFormation for this cluster, it brings all the instances down, and my ELB starts to fail because there are no instances running to serve requests.
The cluster runs on spot instances, so my question is: is there a way to update instance types for spot instances without taking the whole cluster down?
Do you have an Auto Scaling group? That would allow you to change the launch template or launch configuration to use the new instance type. Then you would set the ASG desired and minimum counts to a higher number, let instances of the new type spin up and go into service in the target group, then delete the old instances and set your Auto Scaling settings back to normal.
Without an ASG, you could launch a new instance manually, place that instance in the ECS target group. Confirm that it joins the cluster and is running your service and task. Then delete the old instance.
You might want to break this activity into smaller chunks and do them one by one. You can write a small CloudFormation template as well, because by default, if you update the instance type, your instances will be restarted; to avoid downtime, you might have to do it one instance at a time.
However, there are two other ways that I can think of here but both will cost you money.
ASG: Create a new autoscaling group or use the existing one and change the launch configuration.
Blue/Green Deployment: Create the exact set of resources but this time with updated instance type and use Route53's weighted routing policy to control the traffic.
It depends solely on the requirement: if you can afford the extra cost, go with one of the two approaches above; otherwise stick with the small deployments.
I have an infrastructure that uses Amazon Elastic Beanstalk to deploy my application.
I need to scale my app by adding some spot instances, which EB does not support.
So I created a second Auto Scaling group from a launch configuration with spot instances.
This Auto Scaling group uses the same load balancer created by Beanstalk.
To bring up instances with the latest version of my app, I copy the user data from the original launch configuration (created by Beanstalk) to the launch configuration with spot instances (created by me).
This works fine, but:
How can I update the spot instances launched by the second Auto Scaling group when Beanstalk updates the instances it manages with a new version of the app?
Is there another way, equally easy and elegant, to use spot instances and still enjoy the benefits of Beanstalk?
UPDATE
Elastic Beanstalk has supported spot instances since 2019; see:
https://docs.aws.amazon.com/elasticbeanstalk/latest/relnotes/release-2019-11-25-spot.html
I was asking this myself and found a builtin solution in elastic beanstalk. It was described here as follows:
Add a file under the .ebextensions folder; for our setup we’ve named the file spot_instance.config (the .config extension is important). Paste the content available below in the file:
https://gist.github.com/rahulmamgain/93f2ad23c9934a5da5bc878f49c91d64
The value for EC2_SPOT_PRICE can be set through the Elastic Beanstalk environment configuration. To disable the usage of spot instances, just delete the variable from the environment settings.
If the environment already exists and the above settings are updated, the older auto scaling group will be destroyed and a new one is created.
The environment then submits a request for spot instances which can be seen under Spot Instances tab on the EC2 dashboard.
Once the request is fulfilled the instance will be added to the new cluster and auto scaling group.
You can use Spot Advisor tool to ascertain the best price for the instances in use.
A price point of 30% of the original price seems like a decent level.
I personally would just use the on-demand price for the given instance type given this price is the upper boundary of what you would be willing to pay. This reduces the likelihood of being out-priced and thus the termination of your instances.
This might be not the best approach for production systems as it is not possible to split between a number of on-demand instances and an additional number of spot instances and there might be a small chance that there are no spot instances available as someone else is buying the whole market with high bids.
For production use cases I would look into https://github.com/AutoSpotting/AutoSpotting, which actively manages all your auto-scaling groups and tries to meet the balance between the lowest prices and a configurable number or percentage of on-demand instances.
As of 25th November 2019, AWS natively supports using Spot Instances with Beanstalk.
Spot instances can be enabled in the console by going to the desired Elastic Beanstalk environment, then selecting Configuration > Capacity and changing the Fleet composition to "Spot instance enabled".
There you can also set options such as the On-Demand vs Spot percentage and the instance types to use.
More information can be found in the Beanstalk Auto Scaling Group support page
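The same console settings can, as far as I know, also be applied from the AWS CLI through the aws:ec2:instances option namespace. The sketch below is an illustration only: the function names and the default values for the on-demand base and percentage are arbitrary choices for this example, not recommendations.

```shell
#!/usr/bin/env bash
# Sketch only: enable Spot instances on an existing Elastic Beanstalk environment.
# enable_spot and opt are hypothetical helper names.

# Build one option-setting JSON object in the aws:ec2:instances namespace.
opt() {
  printf '{"Namespace":"aws:ec2:instances","OptionName":"%s","Value":"%s"}' "$1" "$2"
}

# Args: environment name, on-demand base count, on-demand percentage above base.
enable_spot() {
  env="$1"; base="${2:-0}"; pct="${3:-70}"
  settings="[$(opt EnableSpot true),$(opt SpotFleetOnDemandBase "$base"),$(opt SpotFleetOnDemandAboveBasePercentage "$pct")]"
  aws elasticbeanstalk update-environment \
    --environment-name "$env" \
    --option-settings "$settings"
}
```

For example, `enable_spot my-env 1 50` would keep one on-demand instance as a base and split the remaining capacity evenly between on-demand and Spot.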
Here at Spotinst, we were dealing with exactly that dilemma for our customers.
As Elastic Beanstalk creates a whole stack of services (Load Balancers, ASG’s, Route 53 access point etc..) that are tied together, it isn’t a simple task to manage Spots within it.
After a lot of research, we figured that removing the ASG will always be prone to errors as keeping the configuration intact gets complex. Instead, we simply replicate the ASG and let our Elastigroup and the ASG live side by side with all the scaling policies only affecting the Elastigroup and the ASG configuration updates feeding there as well.
With the instances running inside Elastigroup, you achieve managed Spot instances with full SLA.
Some of the benefits of running your Spot instances in Elastigroup include:
1) Our algorithm makes live choices for the best Spot markets in terms of price and availability whenever new instances spin up.
2) When an interruption happens, we predict it about 15 minutes in advance and take all the necessary steps to ensure (and insure) the capacity of your group.
3) In the extreme case that none of the markets have Spot availability, we simply fall back to an on-demand instance.
Since AWS clearly states that Beanstalk does not support spot instances out-of-the-box you need to tinker a bit with the thing. My customer wanted mixed environment (on-demand + spot) and full spot. What I created for my customer was the following (I had access to GUI only):
For the mixed env:
start the env with regular instance;
copy the respective launch configuration and choose spot instances during the process;
edit the Auto Scaling Group and choose the LC you just created, and be sure to change the Termination Policy to NewestInstance.
Such setup will allow you to have basic on-demand fleet (not-terminable) + some extra spots if required, e.g., higher-than-usual traffic. Remember that if you terminate the environment and recreate it then all of your edits will be removed.
For full spot env:
Similar steps as before, with one difference: terminate the running instance and wait for the ASG to launch a new one. If you want to do it without downtime, just add one extra instance to the Desired number, wait for it to launch, and then terminate the on-demand one.
We have a terraform deployment that creates an auto-scaling group for EC2 instances that we use as docker hosts in an ECS cluster. On the cluster there are tasks running. Replacing the tasks (e.g. with a newer version) works fine (by creating a new task definition revision and updating the service -- AWS will perform a rolling update). However, how can I easily replace the EC2 host instances with newer ones without any downtime?
I'd like to do this to e.g. have a change to the ASG launch configuration take effect, for example switching to a different EC2 instance type.
I've tried a few things, here's what I think gets closest to what I want:
1. Drain one instance. The tasks will be distributed to the remaining instances.
2. Once no tasks are running on that instance anymore, terminate it.
3. Wait for the ASG to spin up a new instance.
4. Repeat steps 1 to 3 until all instances are new.
This almost works. The problem is that:
It's manual and therefore error prone.
After this process one of the instances (the last one that was spun up) is running 0 (zero) tasks.
Is there a better, automated way of doing this? Also, is there a way to re-distribute the tasks in an ECS cluster (without creating a new task revision)?
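As an illustration, the manual drain-and-terminate steps from the question could be scripted with the AWS CLI roughly as below. The function names are invented for this sketch. The last helper uses `--force-new-deployment`, which restarts a service's tasks without a new task definition revision; whether that spreads tasks onto the freshly launched host depends on the service's placement strategy, so treat that part as an assumption.

```shell
#!/usr/bin/env bash
# Sketch only: drain one ECS container instance, then let the ASG replace it.
# All function names are hypothetical.

# Set container instance $2 in cluster $1 to DRAINING and wait until it is empty.
drain_instance() {
  cluster="$1"; ci="$2"
  aws ecs update-container-instances-state --cluster "$cluster" \
    --container-instances "$ci" --status DRAINING >/dev/null
  until [ "$(aws ecs describe-container-instances --cluster "$cluster" \
      --container-instances "$ci" \
      --query 'containerInstances[0].runningTasksCount' --output text)" -eq 0 ]; do
    sleep 10
  done
}

# Drain the host, then terminate it so the ASG launches a replacement.
replace_ecs_host() {
  cluster="$1"; ci="$2"
  drain_instance "$cluster" "$ci"
  ec2_id="$(aws ecs describe-container-instances --cluster "$cluster" \
    --container-instances "$ci" \
    --query 'containerInstances[0].ec2InstanceId' --output text)"
  aws autoscaling terminate-instance-in-auto-scaling-group \
    --instance-id "$ec2_id" --no-should-decrement-desired-capacity >/dev/null
}

# Restart the tasks of service $2 in cluster $1 without a new task revision.
rebalance_service() {
  aws ecs update-service --cluster "$1" --service "$2" \
    --force-new-deployment >/dev/null
}
```

One would call `replace_ecs_host my-cluster <container-instance-arn>` per host, waiting for each replacement to register with the cluster before moving on, and finish with `rebalance_service my-cluster my-service`.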
Prior to making changes make sure you have the ASG spanned across multiple availability zones and so are the containers. This ensures High Availability when instances are down in one Zone.
You can configure an update policy for the Auto Scaling group (the CloudFormation UpdatePolicy attribute) with AutoScalingRollingUpdate, where you can set MinInstancesInService and MinSuccessfulInstancesPercent to higher values for a slow and safe rolling upgrade.
You may go through this documentation to find further tweaks. To automate this process, you can use Terraform to update the ASG launch configuration; this will update the ASG with a new version of the launch configuration and trigger a rolling upgrade.
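Outside of CloudFormation, a comparable rolling replacement can be requested directly from EC2 Auto Scaling with an instance refresh. This is a different mechanism from the UpdatePolicy described above, shown here only as a sketch; the function name and the default of 90 percent healthy are assumptions for this example.

```shell
#!/usr/bin/env bash
# Sketch only: start an instance refresh on an ASG and wait for the result.
# refresh_asg is a hypothetical helper name.

# Args: ASG name, minimum healthy percentage to keep in service (default 90).
refresh_asg() {
  asg="$1"; min_healthy="${2:-90}"
  id="$(aws autoscaling start-instance-refresh \
    --auto-scaling-group-name "$asg" \
    --preferences "{\"MinHealthyPercentage\": $min_healthy}" \
    --query 'InstanceRefreshId' --output text)"
  while :; do
    status="$(aws autoscaling describe-instance-refreshes \
      --auto-scaling-group-name "$asg" --instance-refresh-ids "$id" \
      --query 'InstanceRefreshes[0].Status' --output text)"
    case "$status" in
      Successful) return 0 ;;
      Failed|Cancelled|RollbackFailed|RollbackSuccessful)
        echo "instance refresh $id ended with status $status" >&2
        return 1 ;;
    esac
    sleep 30   # still Pending or InProgress; poll again
  done
}
```

Running `refresh_asg www-asg 75` after pointing the group at the new launch configuration would replace instances in batches while keeping at least 75 percent of the group healthy.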
Here's what I have in AWS:
Application ELB
Auto Scaling Group with 2 instances in different regions (Windows IIS servers)
Launch Config pointing to AMI_A
all associated back end stuff configured (VPC, subnets, security groups, etc.)
Everything works. However, when I need to make an update or change to the servers, I am currently manually creating a new AMI_B, creating a new LaunchConfig using AMI_B, updating the AutoScalingGroup to use the new LaunchConfig, increasing min number of instances to 4, waiting for them to become available, then decreasing the number back to 2 to kill off the old instances.
I'd really love to automate this process. Amazon gave me some links to CLI stuff, and I'm able to script the AMI creation, create the LaunchConfig, and update the AutoScalingGroup...but I don't see an easy way to script spinning up the new instances.
After some searching, I found some CloudFormation templates that look like they'd do what I want, but most do more, and it's a bit confusing to me.
Should I be exploring CloudFormation? Is there a simple guide I can follow to get started? Or should I stay with the scripting I have started?
PS - sorry if this is a repeated question. Things change frequently at AWS, so sometimes the older responses may not be the current best answers.
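For what it's worth, the manual "double the group, wait, shrink back" procedure described in this question could be scripted with the AWS CLI roughly as follows. It is only a sketch: the function names are invented, the group's max size is assumed to already allow twice the normal count, and the OldestLaunchConfiguration termination policy is set so that the shrink step prefers instances from the old launch configuration.

```shell
#!/usr/bin/env bash
# Sketch only: temporarily double an ASG, then shrink it back so that
# instances from the old launch configuration are removed.
# in_service and surge_and_shrink are hypothetical helper names.

# Count InService instances in group $1.
in_service() {
  aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names "$1" \
    --query "length(AutoScalingGroups[0].Instances[?LifecycleState=='InService'])" \
    --output text
}

# Args: ASG name, normal instance count (for example 2).
surge_and_shrink() {
  asg="$1"; normal="$2"; surge=$((normal * 2))

  # Scale out; new instances come from the group's current launch configuration
  aws autoscaling update-auto-scaling-group --auto-scaling-group-name "$asg" \
    --termination-policies OldestLaunchConfiguration \
    --min-size "$surge" --desired-capacity "$surge"

  # Wait for the new instances to be in service
  until [ "$(in_service "$asg")" -ge "$surge" ]; do sleep 15; done

  # Scale back in; the termination policy picks old-configuration instances
  aws autoscaling update-auto-scaling-group --auto-scaling-group-name "$asg" \
    --min-size "$normal" --desired-capacity "$normal"
}
```

After updating the group to the new launch configuration, `surge_and_shrink my-asg 2` would go from 2 to 4 instances and back to 2.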
You have a number of options to automate the process of updating the instances in an Auto Scaling Group to a new or updated Launch Configuration:
CloudFormation
If you do want to use CloudFormation to manage updates to your Auto Scaling Group's instances, refer to the UpdatePolicy attribute of the AWS::AutoScaling::AutoScalingGroup Resource for documentation, and the "What are some recommended best practices for performing Auto Scaling group rolling updates?" page in the AWS Knowledge Center for more advice.
If you'd also like to script the creation/update of your AMI within a CloudFormation resource, see my answer to the question, "Create AMI image as part of a cloudformation stack".
Note, however, that CloudFormation is not a simple tool: it's a complex, relatively low-level service for orchestrating AWS resources, and migrating your existing scripts to it will likely take some time investment due to its steep learning curve.
Elastic Beanstalk
If simplicity is most important, then I'd suggest you evaluate Elastic Beanstalk, which also supports both rolling and immutable updates during deployments, in a more fully managed, console-oriented, platform-as-a-service environment. Refer to my answer to the question, "What is the difference between Elastic Beanstalk and CloudFormation for a .NET project?" for further comparisons between CloudFormation and Elastic Beanstalk.
CodeDeploy
If you want a solution for updating instances in an auto-scaling group that you can plug into existing scripts, AWS CodeDeploy might be worth looking into. You install an agent on your instances, then trigger deployments through the API/CLI/Console and it manages deploying application updates to your fleet of instances. See Deploy an Application to an Auto Scaling Group Using AWS CodeDeploy for a complete tutorial. While CodeDeploy supports 'in-place' deployments and 'blue-green' deployments (see Working With Deployments for details), I think this service assumes an approach of swapping out S3-hosted application packages onto a static base AMI rather than replacing AMIs on each deployment. So it might not be the best fit for your AMI-swapping use case, but perhaps worth looking into anyway.
You want a custom Termination policy on the Auto Scaling Group.
OldestLaunchConfiguration. Auto Scaling terminates instances that have the oldest launch configuration. This policy is useful when you're updating a group and phasing out the instances from a previous configuration.
To customize a termination policy using the console
Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/.
On the navigation pane, choose Auto Scaling Groups.
Select the Auto Scaling group.
For Actions, choose Edit.
On the Details tab, locate Termination Policies. Choose one or more termination policies. If you choose multiple policies, list them in the order that you would like them to apply. If you use the Default policy, make it the last one in the list.
Choose Save.
On the CLI
aws autoscaling update-auto-scaling-group --auto-scaling-group-name my-asg --termination-policies "OldestLaunchConfiguration"
https://docs.aws.amazon.com/autoscaling/latest/userguide/as-instance-termination.html
We use Ansible's ec2_asg module for that. It has replace_all_instances and replace_batch_size settings for exactly this case. Per the documentation:
In a rolling fashion, replace all instances that used the old launch configuration with one from the new launch configuration.
It increases the ASG size by C(replace_batch_size), waits for the new instances to be up and running.
After that, it terminates a batch of old instances, waits for the replacements, and repeats, until all old instances are replaced.
Once that's done the ASG size is reduced back to the expected size.
If you provide target_group_arns, module will check for health of instances in target groups before going to next batch.
Edit: in order to maintain desired number of instances, we first set min to desired.
Goal: To maintain the minimum startup period for bringing up instances to load balance and reduce the troubleshooting time.
Approach:
1. Create a base custom AMI for EC2 instances.
2. Update/rebundle the custom AMI on every release and s/w patch (code and software updates related to the healthy running instance).
2.a. Is using Packer or any CI for the update possible? If so, how? (I was unable to find a step-by-step approach in the documentation.)
3. Automate step 1 and step 2 using Chef.
4. Integrate this AMI in the Auto Scaling group (experimented with this).
5. Map the load balancer to the ASG [done].
6. Maintain the desired count of instances by bringing up instances from the updated AMI in the ASG with the LB upon failure.
Crux: Terminate the unhealthy instance and bring up the healthy instance with ami asap.
--
P.S:
I have gone through many posts from http://blog.kik.com/2016/03/09/using-packer-io-to-optimize-and-manage-ami-creation/ and https://alestic.com/.
Using Docker is ruled out of the discussion.
But I am still unable to figure out a clear way to do it.
The simplest way to swap out a new AMI in an existing ASG is to update the launch config and then one by one kill any instance using the old AMI ID. The ASG will bring up new instances as needed, which should use the new AMI. If you want to get fancier (like keeping old instances alive for quick rollback) check out tools like Spinnaker which brings up each new AMI as a new corresponding ASG and then remaps the ELB to swap traffic over if no problems are detected, and then later on when you are sure the deploy is good it kills off the old ASG and all associated instances.