There is a set of rules for which instance Auto Scaling terminates when a group spans multiple AZs. In the same way, when we scale up across multiple Availability Zones, where exactly will the new instances be created? Is there any hierarchy?
According to the AWS docs, if you have multiple Availability Zones for an Auto Scaling group, AWS tries to distribute the instances evenly. So if your desired capacity is 8 and there are 4 instances in az-1 and 3 in az-2, the one remaining instance will be created in az-2.
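The 4 + 3 example above can be sketched in a few lines of Python. This is only a toy model of the "fewest instances" rule, not AWS code: the function name `place_instances` and the alphabetical tie-break are assumptions made for illustration.

```python
def place_instances(counts, desired):
    """Return per-AZ instance counts after scaling out to `desired` in total."""
    counts = dict(counts)  # work on a copy, leave the caller's dict untouched
    while sum(counts.values()) < desired:
        # Pick the AZ with the fewest instances.
        # Tie-break by AZ name is an assumption, AWS does not document one.
        target = min(counts, key=lambda az: (counts[az], az))
        counts[target] += 1
    return counts


result = place_instances({"az-1": 4, "az-2": 3}, desired=8)
print(result)
```

With 4 instances in az-1 and 3 in az-2, the single missing instance lands in az-2, so both zones end up with 4.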
When one Availability Zone becomes unhealthy or unavailable, Amazon EC2 Auto Scaling launches new instances in an unaffected Availability Zone. When the unhealthy Availability Zone returns to a healthy state, Amazon EC2 Auto Scaling automatically redistributes the application instances evenly across all the Availability Zones for your Auto Scaling group. Amazon EC2 Auto Scaling does this by attempting to launch new instances in the Availability Zone with the fewest instances. If the attempt fails, however, Amazon EC2 Auto Scaling attempts to launch in other Availability Zones until it succeeds.
You can read more about this here.
Related
The vpc_zone_identifier parameter is defined as a list of subnet IDs to launch resources in. The subnets automatically determine which Availability Zones the group will reside in.
So suppose I list subnets in eu-west-1a and eu-west-1c for that parameter, with a desired capacity of 3.
Is my ASG going to spread my desired capacity across the AZs (e.g. 2 + 1), or will it deploy 3 per AZ?
There will be only 3 instances distributed across the two AZs if the selected AZs have enough capacity. AWS tries to prioritize high-availability, so it will try to place the instances evenly across the AZs (2+1 in your case). Exact details are:
Amazon EC2 Auto Scaling attempts to distribute instances evenly between the Availability Zones that are enabled for your Auto Scaling group. Amazon EC2 Auto Scaling does this by attempting to launch new instances in the Availability Zone with the fewest instances. If the attempt fails, however, Amazon EC2 Auto Scaling attempts to launch the instances in another Availability Zone until it succeeds. For Auto Scaling groups in a VPC, if there are multiple subnets in an Availability Zone, Amazon EC2 Auto Scaling selects a subnet from the Availability Zone at random.
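As a rough illustration of the quoted behaviour, the sketch below models a group with subnets in two AZs and a desired capacity of 3. It is a simulation only: the subnet IDs, the `launch` function and the alphabetical tie-break are hypothetical, and the real service may break ties differently.

```python
import random


def launch(subnets_by_az, desired, rng=None):
    """Simulate placement: fewest-instances AZ first, then a random subnet in it."""
    rng = rng or random.Random()
    per_az = {az: 0 for az in subnets_by_az}
    placements = []
    for _ in range(desired):
        # AZ with the fewest instances (tie-break by name is an assumption).
        az = min(per_az, key=lambda name: (per_az[name], name))
        # Within the AZ, a subnet is picked at random, as the docs describe.
        subnet = rng.choice(subnets_by_az[az])
        per_az[az] += 1
        placements.append((az, subnet))
    return per_az, placements


# Hypothetical subnet IDs, two in eu-west-1a and one in eu-west-1c.
subnets = {
    "eu-west-1a": ["subnet-aaa1", "subnet-aaa2"],
    "eu-west-1c": ["subnet-ccc1"],
}
per_az, placements = launch(subnets, desired=3, rng=random.Random(0))
print(per_az)
print(placements)
```

The total stays at 3 and the split is 2 + 1; which AZ gets the extra instance depends on the tie-break, which is not something you can rely on.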
My AWS solution spans 3 Availability Zones. In my backend the user is able to trigger a heavy compute job on beefy px instances. I therefore wrote a CFN template which provisions all the resources needed to execute the compute job (secret store, IAM role, EC2 instance, log group). However, when I try to create the stack from the template, it returns a 500 and states that no capacity for my instance type is available in the Availability Zone I chose. My template provides a subnet for the EC2 instance and an Availability Zone for the attached volume. In the end I don't care in which Availability Zone the EC2 instance is provisioned, as long as it is in one of my subnets. Does someone know a way to provision an EC2 instance and its volume (with CloudFormation) without specifically choosing one Availability Zone, but rather by providing a range of subnets/Availability Zones?
TLDR:
Does someone know a way to provision an EC2 instance and its volume (with CloudFormation) without specifically choosing one Availability Zone, but rather by providing a range of subnets/Availability Zones?
How can I make sure the ASG scales EC2 instances in a specific zone sequence? For example, when I scale the ASG from 3 instances to 5, I need 2 nodes in Zone-A, 2 in Zone-B and 1 in Zone-C. But in our case, it ends up with 2 nodes in Zone-A, 1 node in Zone-B and 2 nodes in Zone-C.
AWS ASG launches new instances in all Availability zones you enabled for that particular ASG. This is an extract from the official documentation.
Amazon EC2 Auto Scaling attempts to distribute instances evenly between the Availability Zones that are enabled for your Auto Scaling group. Amazon EC2 Auto Scaling does this by attempting to launch new instances in the Availability Zone with the fewest instances. If the attempt fails, however, Amazon EC2 Auto Scaling attempts to launch the instances in another Availability Zone until it succeeds.
If you increase the desired capacity to, say, 9 (and you have 3 AZs), there's a high chance you'll see 3 instances in each AZ.
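The fallback described in the extract ("attempts to launch the instances in another Availability Zone until it succeeds") is one reason a 3-to-5 scale-out can end up as 2 + 1 + 2. A minimal sketch, assuming a hypothetical `can_launch` check that stands in for whether a zone currently has capacity, and assuming the remaining zones are tried in order of instance count (the documentation does not specify the order):

```python
def choose_az(counts, can_launch):
    """Return the first AZ, ordered by fewest instances, where a launch succeeds."""
    # Ordering the fallback candidates by count, then name, is an assumption.
    for az in sorted(counts, key=lambda name: (counts[name], name)):
        if can_launch(az):
            return az
    return None  # no zone could take the instance


counts = {"zone-a": 2, "zone-b": 1, "zone-c": 1}

# Normal case: zone-b is tied for fewest instances and is available.
print(choose_az(counts, lambda az: True))

# zone-b has no capacity, so the launch falls through to zone-c.
print(choose_az(counts, lambda az: az != "zone-b"))
```

If zone-b stays unavailable for one launch, the group reaches 2 + 1 + 2 even though it was aiming for an even spread.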
There is no way to control which AZ the AutoScaling Group will launch instances in.
The only workaround I can think of is to create one ASG per AZ and then control the desired capacity of each group yourself via a script, instead of using a scaling policy. I would recommend making your application as ephemeral as possible, without zonal dependencies, so that instances can be added in any zone.
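A sketch of what such a script might compute, assuming one ASG per zone and a fixed zone priority order. The group names and the `split_desired` helper are hypothetical; each resulting value could then be applied to its group, for example with the `set_desired_capacity` call of the boto3 Auto Scaling client, which is left out here so the snippet runs without AWS credentials.

```python
def split_desired(total, zones):
    """Split `total` instances across `zones`, filling earlier zones first."""
    base, extra = divmod(total, len(zones))
    # The first `extra` zones in the list each get one additional instance.
    return {zone: base + (1 if i < extra else 0) for i, zone in enumerate(zones)}


# Hypothetical per-zone ASG names, listed in the order they should be filled.
groups = ["asg-zone-a", "asg-zone-b", "asg-zone-c"]
plan = split_desired(5, groups)
print(plan)
```

For 5 instances this yields 2 in Zone-A, 2 in Zone-B and 1 in Zone-C, the sequence a single multi-AZ group cannot guarantee.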
I have an Auto Scaling group and I want it to stop instances rather than terminate them. Is it possible to do so?
No. From the official definition:
Auto Scaling is a web service designed to launch or terminate Amazon EC2 instances automatically based on user-defined policies, schedules, and health checks.
When scaling-out, new instances are launched into the Auto Scaling group.
When scaling-in, instances are terminated.
Auto Scaling does not start/stop instances.
Some benefits of this approach are:
Instances can be launched in different Availability Zones in case there is a failure in a particular AZ
Failed instances can be easily replaced
There is no limit to the number of instances that could be launched (compared to running out of 'stopped' instances)
Launch Configurations can be updated, so any newly-launched instances will use the new configuration (as opposed to recycling old instances)
Consider the case when an Auto Scaling group is configured to span multiple availability zones (such as in this scenario). When a new Amazon EC2 instance should be added to the scaling group (scale out) based on demand, how does Auto Scaling decide in which availability zone the instance will be placed? The one that has the smaller number of instances?
Thanks for your help.
As you expected, Auto Scaling does indeed select the zone that has the smaller number of instances. The section Instance Distribution and Balance Across Multiple Zones within Availability Zones and Regions explains the general algorithm employed by Auto Scaling:
Auto Scaling attempts to distribute instances evenly between the
Availability Zones that are enabled for your Auto Scaling group. Auto
Scaling does this by attempting to launch new instances in the
Availability Zone with the fewest instances. If the attempt fails,
however, Auto Scaling will attempt to launch in other zones until it
succeeds. [emphasis mine]
An Auto Scaling group can also become unbalanced between the zones by various conditions (e.g. active termination of an instance), which can trigger an Auto Scaling rebalancing activity - please check the documentation linked above for more details on this and how edge cases are handled.
Generally it's best to scale in such a way that the distribution of instances across zones stays even (if you have 3 zones, scaling up would mean adding 3 instances, 1 to each zone). Adding more capacity to one zone does not mean traffic will be split based on capacity. It will still be distributed round robin.
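One way to follow this advice is to always pick a desired capacity that is a multiple of the zone count. A small sketch, where `next_even_capacity` is a hypothetical helper and not part of any AWS SDK:

```python
def next_even_capacity(current, zone_count):
    """Smallest multiple of `zone_count` that is strictly greater than `current`."""
    return (current // zone_count + 1) * zone_count


# With 3 zones and 3 instances, the next even step is 6 (one more per zone).
print(next_even_capacity(3, 3))
# From an uneven 4 instances, the next even step is also 6.
print(next_even_capacity(4, 3))
```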