I have two Scale-out rules:
Scale-out-rule-1: Add 1 instance if YARNMemoryAvailablePercentage is less than 15 for 1 five-minute period with a cooldown of 300 seconds.
Scale-out-rule-2: Add 5 instances if ContainerPendingRatio is greater than 0.75 for 1 five-minute period with a cooldown of 300 seconds.
Here, if both conditions are met at the same time:
Does it process both rules? In any order?
If only one rule is processed, then which one and why?
I'd appreciate comments on a similar scenario for scale-in (cluster scale-down).
Q 1) Does it process both rules? In any order?
Only one rule will be processed when both rules are triggered at the same time; EC2 Auto Scaling chooses the policy that provides the largest capacity.
In your case, "Scale-out-rule-2" will be processed because it adds 5 instances, and "Scale-out-rule-1" will be suspended.
Reference: https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-scale-based-on-demand.html#multiple-scaling-policy-resolution
Q 2) If only one rule is processed, then which one and why?
Explained above.
I'd like to share what I have learned.
=== Two rules ===
Scaling-out Rule1:
Add 1 instance if YARNMemoryAvailablePercentage is < 15 for 1 five-minute period with a cooldown of 300 seconds.
Scaling-out Rule2:
Add 5 instances if ContainerPendingRatio is > 0.75 for 1 five-minute period with a cooldown of 300 seconds.
EMR clusters internally use "Amazon EC2 Auto Scaling", because an EMR instance group is also a group backed by a fleet of EC2 instances.
[1] https://docs.aws.amazon.com/autoscaling/ec2/userguide/what-is-amazon-ec2-auto-scaling.html
Consequently, its scale-out/scale-in behavior follows that of "Amazon EC2 Auto Scaling". According to the doc[2], when these situations occur, Amazon EC2 Auto Scaling (attached to the EMR instance group) chooses the policy that provides the largest capacity, for both scale-out and scale-in. In this case, the "ContainerPendingRatio" rule will be triggered because it adds 5 instances. You can find more details and reasoning in the doc[2].
[2] Multiple Scaling Policies
https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-scale-based-on-demand.html#multiple-scaling-policy-resolution
I ran experiments after creating an EMR cluster in my account, and I saw the same result as expected.
I hope this helps you.
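For reference, here is a minimal boto3 sketch of how the two rules above could be attached to an EMR instance group. This is an illustration, not the exact setup from the question: the cluster ID, instance group ID, and capacity bounds are placeholders.

import boto3

emr = boto3.client("emr")

emr.put_auto_scaling_policy(
    ClusterId="j-XXXXXXXXXXXXX",         # placeholder cluster ID
    InstanceGroupId="ig-XXXXXXXXXXXXX",  # placeholder instance group ID
    AutoScalingPolicy={
        "Constraints": {"MinCapacity": 2, "MaxCapacity": 20},  # illustrative bounds
        "Rules": [
            {
                "Name": "Scale-out-rule-1",
                "Action": {"SimpleScalingPolicyConfiguration": {
                    "AdjustmentType": "CHANGE_IN_CAPACITY",
                    "ScalingAdjustment": 1,   # add 1 instance
                    "CoolDown": 300,
                }},
                "Trigger": {"CloudWatchAlarmDefinition": {
                    "ComparisonOperator": "LESS_THAN",
                    "EvaluationPeriods": 1,   # 1 five-minute period
                    "MetricName": "YARNMemoryAvailablePercentage",
                    "Period": 300,
                    "Statistic": "AVERAGE",
                    "Threshold": 15.0,
                    "Unit": "PERCENT",
                }},
            },
            {
                "Name": "Scale-out-rule-2",
                "Action": {"SimpleScalingPolicyConfiguration": {
                    "AdjustmentType": "CHANGE_IN_CAPACITY",
                    "ScalingAdjustment": 5,   # add 5 instances
                    "CoolDown": 300,
                }},
                "Trigger": {"CloudWatchAlarmDefinition": {
                    "ComparisonOperator": "GREATER_THAN",
                    "EvaluationPeriods": 1,
                    "MetricName": "ContainerPendingRatio",
                    "Period": 300,
                    "Statistic": "AVERAGE",
                    "Threshold": 0.75,
                    "Unit": "COUNT",
                }},
            },
        ],
    },
)

When both alarms fire at the same time, the resolution described above applies: the rule that adds 5 instances wins.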
I don't think it is possible; there is no such thing available in the Amazon docs.
Related
In my AWS elastic server setup, I have configured 4 alarms:
Add an instance when CPU utilization > 20
Add an instance when TargetResponseTime > 0.9
Remove an instance when CPU utilization < 20
Remove an instance when TargetResponseTime < 0.9
What will happen if two or more alarms are triggered together?
For example:
If alarms 1 and 2 trigger together, will it add two instances?
If alarms 1 and 4 trigger together, will it remove an instance and add one, or will it stay neutral?
The alarms are working fine, but I want to understand the mechanism behind alarm action execution.
Any idea?
Your Auto Scaling group has a cooldown period, so technically multiple actions cannot occur at the same time. The next action can only occur after the cooldown period has passed.
This functionality exists to prevent exactly what you're describing: multiple scaling actions firing at once.
Personally, I think that for what you're doing you should make use of a composite CloudWatch alarm, as sketched below. By having an OR condition, these 4 alarms could become 2, which would reduce the number of alarms needed to trigger an autoscaling action.
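As a hedged sketch of that idea (boto3; the alarm names and the policy ARN are placeholders, not taken from your setup), a composite alarm that ORs two of your alarms into one scaling trigger could look like this:

import boto3

cloudwatch = boto3.client("cloudwatch")

cloudwatch.put_composite_alarm(
    AlarmName="scale-out-composite",
    # Fires when EITHER underlying alarm is in ALARM state
    AlarmRule='ALARM("high-cpu-utilization") OR ALARM("high-target-response-time")',
    AlarmActions=["arn:aws:autoscaling:..."],  # placeholder: your scale-out policy ARN
)

A second composite alarm with the two "remove" alarms would cover the scale-in side the same way.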
What is the difference between AWS ASG cooldown period (which I can edit when I want to update my autoscaling group) and the warmup period in the scaling policy?
Cooldowns prevent runaway scaling events. If your system is running high on CPU and your auto scaling rule adds an instance, it will take 5 minutes or so before the instance is fully spun up and helping with the load. Without a cooldown, the rule would keep firing and might add 4 or 5 instances before the CPU metrics came down, resulting in wasteful over-provisioning; or, in the scale-in case, it would overshoot and result in under-provisioning.
With a cooldown period in place, the Auto Scaling group launches an instance and then suspends scaling activities due to simple scaling policies or manual scaling until the specified time elapses. (The default is 300 seconds.) This gives newly launched instances time to start handling application traffic. After the cooldown period expires, any suspended scaling actions resume. If the CloudWatch alarm fires again, the Auto Scaling group launches another instance, and the cooldown period takes effect again. If, however, the additional instance was enough to bring the CPU utilization back down, then the group remains at its current size.
Cooldown
Instance Warmup
The warm-up value for instances allows you to control the time until a newly launched instance can contribute to the CloudWatch metrics; once the warm-up time has expired, an instance is considered part of the Auto Scaling group and will receive traffic.
With step scaling policies, you can specify the number of seconds that it takes for a newly launched instance to warm up. Until its specified warm-up time has expired, an instance is not counted toward the aggregated metrics of the Auto Scaling group. While scaling out, AWS also does not consider instances that are warming up as part of the current capacity of the group. Therefore, multiple alarm breaches that fall in the range of the same step adjustment result in a single scaling activity. This ensures that we don't add more instances than you need.
as-scaling-target-tracking
I determined two key differences and one observation of warm-up:
warm-up applies only to scale out
cooldown applies to both scale in and scale out
warm-up applies to all scaling policies
cooldown applies only to simple scaling policies
warm-up of a step scaling policy only inhibits scale-out by the number of instances already warming up:
Scaling according to the first step, from 10 -> 11, will trigger a warm-up period.
During this period, metrics within the first range can be ignored entirely.
A metric in a second step, say adding 3 instances, will only trigger adding 3 - 1 = 2 instances instead while warm-up is in effect.
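To make the distinction concrete, here is a minimal boto3 sketch of a step scaling policy with an instance warm-up; the group name, step boundaries, and numbers are illustrative assumptions, not taken from the posts above:

import boto3

autoscaling = boto3.client("autoscaling")

autoscaling.put_scaling_policy(
    AutoScalingGroupName="my-asg",   # placeholder
    PolicyName="cpu-step-scale-out",
    PolicyType="StepScaling",
    AdjustmentType="ChangeInCapacity",
    EstimatedInstanceWarmup=300,     # new instances are excluded from the group's
                                     # metrics and current capacity for 300 seconds
    StepAdjustments=[
        # Alarm threshold breached by 0-15: add 1 instance
        {"MetricIntervalLowerBound": 0.0, "MetricIntervalUpperBound": 15.0,
         "ScalingAdjustment": 1},
        # Breached by 15 or more: add 3 instances
        {"MetricIntervalLowerBound": 15.0, "ScalingAdjustment": 3},
    ],
)

While the instance added by the first step is warming up, a breach that lands in the second step results in only 3 - 1 = 2 additional instances, which matches the observation above.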
I have one simple scale-in policy in my Auto Scaling group which is based on CPU utilization.
The policy looks like:
Execute:
When CPUUtilization < 50 for 5 consecutive periods of 60 seconds
Action :
Remove 10 percent of group
Cooldown time:
600 seconds before allowing another scaling activity
Now I would like to add a more aggressive simple policy, saying if CPUUtilization is less than 35 for 5 minutes, remove 20% of the group.
The goal is
When 35 < CPU Utilization < 50 for 5 minutes, remove 10% of the group
When CPU Utilization < 35 for 5 minutes, remove 20% of the group
The problem is that I cannot use a scaling policy with steps, since the cooldown time is not supported there, which could make my ASG keep scaling in until it reaches the minimum number of instances.
And if I have both simple policies, they obviously conflict. I don't really know which policy will be triggered first when CPUUtilization drops below 35.
Does anyone have a workaround of this one?
Thanks.
You would certainly need to use a Scaling Policy with Steps to be able to specify multiple rules within one scaling policy. While it doesn't allow the specification of a cooldown period, it should work fine. I recommend you try it and monitor/test the system.
By the way, you have a very aggressive policy. It is not typically a good idea to scale in based upon only 5 minutes of data. Amazon EC2 is charged in hourly increments, so you might be thrashing (adding and removing instances very quickly), which is not economical. It is typically recommended to scale out quickly (to respond to user demand) but scale in slowly (since there's really no rush).
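If you do go the step route, a hedged boto3 sketch of the two scale-in tiers might look like the following; the group name is a placeholder, and the step bounds are offsets from a CloudWatch alarm set at the 50% CPU threshold (that alarm must invoke this policy via its alarm actions):

import boto3

autoscaling = boto3.client("autoscaling")

autoscaling.put_scaling_policy(
    AutoScalingGroupName="my-asg",   # placeholder
    PolicyName="cpu-step-scale-in",
    PolicyType="StepScaling",
    AdjustmentType="PercentChangeInCapacity",
    StepAdjustments=[
        # 35 <= CPU < 50 (offsets -15..0 from the 50 threshold): remove 10%
        {"MetricIntervalLowerBound": -15.0, "MetricIntervalUpperBound": 0.0,
         "ScalingAdjustment": -10},
        # CPU < 35 (offsets below -15): remove 20%
        {"MetricIntervalUpperBound": -15.0, "ScalingAdjustment": -20},
    ],
)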
When you set up an Auto Scaling group in AWS EC2, the Min and Max bounds seem to make sense:
The minimum number of instances to scale down to based on policies
The maximum number of instances to scale up to based on policies
However, I've never been able to wrap my head around what the heck Desired is intended to affect.
I've always just set Desired equal to Min, because generally, I want to pay Amazon the minimum tithe possible, and unless you need an instance to handle load it should be at the Min number of instances.
I know if you use Elastic Beanstalk and set Min to 1 and Max to 2, it sets Desired to 2 (of course!) -- you can't choose a value for Desired.
What would be the use case for a different Desired number of instances, and how does it differ? When would you expect AWS to scale lower than your Desired if Desired is larger than Min?
Here are the explanations for the "min, desired and max" values from AWS support:
MIN: This will be the minimum number of instances that can run in your auto scale group. If your scale-down CloudWatch alarm is triggered, your auto scale group will never terminate instances below this number.
DESIRED: If you trip a CloudWatch alarm for a scale-up event, then it will notify the auto scaler to change its desired to a specified higher amount, and the auto scaler will start instance(s) to meet that number. If you trip a CloudWatch alarm to scale down, then it will change the auto scaler desired to a specified lower number, and the auto scaler will terminate instance(s) to get to that number.
MAX: This will be the maximum number of instances that you can run in your auto scale group. If your scale-up CloudWatch alarm stays triggered, your auto scale group will never create more instances than the maximum amount specified.
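If it helps to see the three values side by side, here is a quick boto3 sketch (the group name is a placeholder):

import boto3

autoscaling = boto3.client("autoscaling")

group = autoscaling.describe_auto_scaling_groups(
    AutoScalingGroupNames=["my-auto-scaling-group"]
)["AutoScalingGroups"][0]

print(group["MinSize"], group["DesiredCapacity"], group["MaxSize"])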
Think about it like a sliding range UI element.
With min and max, you are setting the lower and upper bounds of your instance scaling. With desired capacity, you are setting where you'd currently like the instance count to hover.
Example:
You know your application will have heavy load due to a marketing email or product launch...simply scale up your desired capacity beforehand:
aws autoscaling set-desired-capacity --auto-scaling-group-name my-auto-scaling-group --desired-capacity 2 --honor-cooldown
Source
"Desired" is (necessarily) ambiguous.
It means the "initial" number of instances. Why not just "initial", then? Because the number may change due to autoscaling events.
So it means the "current" number of instances. Why not just "current", then? Because during an autoscaling event, instances will start / terminate. Those instances do not count towards the "current" number of instances. By "current", a user expects instances that are operational.
So it means the "target" number of instances. Why not just "target", then? I guess "target" is just as good (ambiguous) as "desired"...
When would you expect AWS to scale lower than your Desired if Desired is larger than Min?
This happens when you set a CloudWatch alarm based on some Auto Scaling policy. Whenever that alarm is triggered, it will update the DesiredCount to whatever is mentioned in the config.
e.g., if an AutoScalingGroup config has Min=1, Desired=3, Max=5, and there is an alarm set on an AutoScalingPolicy which says "if CPU usage is < 50% for 10 consecutive minutes, then remove 1 instance", then it will keep reducing the instance count by 1 whenever the alarm is triggered, until DesiredCount = MinCount.
Lesson learnt: set MinCount to be > 0 or = DesiredCount. This will make sure that the application is not brought down when MinCount = 0 and CPU usage goes down.
In layman's terms, the DesiredCapacity value is automatically updated on scale-in and scale-out events.
In other words,
Scale-in and scale-out are done by decreasing or increasing the DesiredCapacity value.
Desired capacity simply means the number of instances that will be fired up when you launch the Auto Scaling group. That means if desired capacity = 4, then 4 instances will keep running unless and until any scale-up or scale-down event triggers. If a scale-up event occurs, the number of instances will go up to the maximum capacity, and if a scale-down event occurs, it will go down to the minimum capacity.
Correct me if I'm wrong, thanks.
I noticed that the desired capacity went down and no new instance came up when I set one of the instances to standby. It kept running but was detached from the ELB (requests were not forwarded to that particular instance when accessed via the ELB DNS). No new instance was initiated by AWS; rather, the desired capacity was decreased by 1.
When I changed the state of the instance (from standby), it was again attached to the ELB (the instance started to receive requests when accessed via the ELB DNS). The desired capacity was increased by 1 and became 2.
Hence, it seems the number of instances attached to the ELB can't cross the limits set by min and max, but the desired capacity is adjusted or changed automatically based on the occurrence of a scale-in or scale-out event. It was definitely something unknown to me.
It might be a way to let AWS know that this is the desired capacity required for the respective ELB at a given point in time.
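What likely happened under the hood is the standby API with "decrement desired capacity" set, roughly like this boto3 sketch (the instance ID and group name are placeholders):

import boto3

autoscaling = boto3.client("autoscaling")

# Desired capacity drops by 1 and the instance is deregistered from the ELB
autoscaling.enter_standby(
    InstanceIds=["i-0123456789abcdef0"],  # placeholder
    AutoScalingGroupName="my-asg",        # placeholder
    ShouldDecrementDesiredCapacity=True,
)

# Returning the instance to service raises the desired capacity by 1 again
autoscaling.exit_standby(
    InstanceIds=["i-0123456789abcdef0"],
    AutoScalingGroupName="my-asg",
)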
Min and max are self-explanatory, but desired was confusing to me until I attached a target tracking auto scaling policy to the ASG, with CPU utilization as the target metric. Here, the desired instance count was scaled out and in based on the target CPU utilization. If a desired count is set through CloudFormation or manually, the ASG will for the time being create the same number of instances as the desired count. But later the ASG policy will automatically adjust the desired instances based on the target CPU utilization.
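For reference, here is a minimal boto3 sketch of such a target tracking policy (the group name and the 50% target are illustrative assumptions):

import boto3

autoscaling = boto3.client("autoscaling")

autoscaling.put_scaling_policy(
    AutoScalingGroupName="my-asg",   # placeholder
    PolicyName="cpu-target-tracking",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 50.0,  # the ASG adjusts desired capacity to hold average CPU near 50%
    },
)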
Desired is what we start with initially. It will move toward min or max depending on scale-in / scale-out events.
I liked the analogy with a slider to understand this - https://stackoverflow.com/a/36272945/10779109
Think of min and max as the minimum and maximum allowed brightness on a screen. You probably don't want min to be 0 in that case (side note). The desired quantity keeps changing based on the environment (in the case of an ASG, it depends on the scaling policies).
For instance, if the following check runs every hour, this is where the desired quantity comes in:
if low_load(metrics) and desired_capacity > min_capacity:  # metrics: CPU or memory, etc.
    desired_capacity = desired_capacity - 1
Max capacity can also be understood in the same way, where you'd want to keep increasing the desired quantity based on a CloudWatch alarm (or any scaling policy) up to the max capacity.
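Translated to the real API, that hourly check could look like this hedged boto3 sketch; low_load() and the group name are hypothetical stand-ins:

import boto3

autoscaling = boto3.client("autoscaling")

def low_load():
    # Hypothetical check of CPU/memory metrics (e.g. via CloudWatch)
    return False

group = autoscaling.describe_auto_scaling_groups(
    AutoScalingGroupNames=["my-asg"]   # placeholder
)["AutoScalingGroups"][0]

if low_load() and group["DesiredCapacity"] > group["MinSize"]:
    autoscaling.set_desired_capacity(
        AutoScalingGroupName="my-asg",
        DesiredCapacity=group["DesiredCapacity"] - 1,
        HonorCooldown=True,  # respect the group's cooldown, as with the CLI above
    )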
I am planning on using AWS Auto Scaling to scale my EC2 services. I have 4 policies that need to control my instance behavior, 2 for scale-out and 2 for scale-in. My question is: what order will they be evaluated in? Scale-out first, then scale-in? Or vice versa? Random? Or something else?
Thank you,
Policies are not evaluated in any particular order. Each policy is compared against the metrics that policy is set up to measure, and takes action based on the results.
For example, perhaps you have the following four policies:
Add 1 instance when an SQS queue depth is > 1000 messages
Remove 1 instance when the same SQS queue depth is < 200 messages
Add 1 instance when the average CPU of all instances in the autoscaling group is > 80%
Remove 1 instance when the average CPU of all instances in the autoscaling group is < 30%
As you can see, ordering doesn't make sense in this context. The appropriate action(s) will be executed whenever the conditions are met.
Note that without planning and testing you can encounter loops where instances constantly cycle up and down. Drawing from the previous example, imagine that a new instance is launched because there are > 1000 messages in the queue, but the CPU usage is only 20% across all instances, so the 4th policy fires to remove an instance. Thus, all the policies should be considered in concert.
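To illustrate why there is no ordering, here is a hedged boto3 sketch of one add/remove pair from the example: each CloudWatch alarm invokes its scaling policy directly and independently (the queue and group names are placeholders):

import boto3

autoscaling = boto3.client("autoscaling")
cloudwatch = boto3.client("cloudwatch")

policy = autoscaling.put_scaling_policy(
    AutoScalingGroupName="my-asg",   # placeholder
    PolicyName="sqs-scale-out",
    PolicyType="SimpleScaling",
    AdjustmentType="ChangeInCapacity",
    ScalingAdjustment=1,             # add 1 instance
    Cooldown=300,
)

cloudwatch.put_metric_alarm(
    AlarmName="sqs-depth-high",
    Namespace="AWS/SQS",
    MetricName="ApproximateNumberOfMessagesVisible",
    Dimensions=[{"Name": "QueueName", "Value": "my-queue"}],  # placeholder
    Statistic="Average",
    Period=300,
    EvaluationPeriods=1,
    Threshold=1000,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=[policy["PolicyARN"]],  # the alarm drives the policy; no central ordering
)

The remove-instance pair is wired the same way with a LessThanThreshold alarm and ScalingAdjustment=-1, so each alarm acts on its own schedule.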