WHY does AWS use two different parameters: cooldown for simple scaling, warmup for step scaling? - aws-asg

I searched and found good explanations of what the AWS Auto Scaling cooldown and warmup timers are, but I would like to better understand WHY there is a separate timer for each type of scaling policy: cooldown for simple scaling, warmup for step scaling.
In other words: what would the problem be if step scaling used only a cooldown timer? Conversely, in what situation would simple scaling behave differently if it used only a warmup timer instead of the pre-defined cooldown timer?
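For concreteness, here is a minimal boto3 sketch of where each timer lives (the group and policy names are hypothetical and the numbers are placeholders): cooldown is set on the simple scaling policy (or as the group's default cooldown), while a step scaling policy instead takes an EstimatedInstanceWarmup.

import boto3

autoscaling = boto3.client("autoscaling")

# Simple scaling: one fixed adjustment; after it fires, the policy is locked
# out for the cooldown period before any further simple-scaling activity.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="my-asg",          # hypothetical group name
    PolicyName="simple-scale-out",
    PolicyType="SimpleScaling",
    AdjustmentType="ChangeInCapacity",
    ScalingAdjustment=1,
    Cooldown=300,                           # seconds to wait before the next simple scaling action
)

# Step scaling: several adjustments keyed to how far the alarm is breached;
# there is no cooldown, but newly launched instances are not counted toward the
# metric-driven capacity calculation until the estimated warmup has elapsed.
autoscaling.put_scaling_policy(
    AutoScalingGroupName="my-asg",
    PolicyName="step-scale-out",
    PolicyType="StepScaling",
    AdjustmentType="ChangeInCapacity",
    EstimatedInstanceWarmup=300,            # seconds before a new instance counts as in-service capacity
    StepAdjustments=[
        {"MetricIntervalLowerBound": 0, "MetricIntervalUpperBound": 20, "ScalingAdjustment": 1},
        {"MetricIntervalLowerBound": 20, "ScalingAdjustment": 2},
    ],
)

Roughly speaking, a cooldown blocks the whole policy from acting again, while a warmup only discounts not-yet-ready instances, which is what lets step scaling keep evaluating larger alarm breaches while instances are still starting.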

Related

AWS ASG target tracking for ECS took 15 minutes to scale in after the ECS desired task count reached 0

I have an ECS cluster on AWS that uses a capacity provider. The ASG associated with the capacity provider is responsible for scaling EC2 instances out and in based on the ECS desired task count. It is worth mentioning that the desired task count is managed by a Lambda function and updated based on some metrics (it calculates the depth of an SQS queue and, based on that, changes the ECS desired task count).
Scaling out happens almost immediately (not counting the provisioning and pending time), but when the desired task count is set to zero in ECS (by the Lambda function), it takes at least 15 minutes for the ASG to turn off the instances. Since we are using high-performance EC2 instance types in large numbers, this scale-in delay costs us a lot of money. Is there any way to reduce this cooldown time to a few minutes?
P.S.: I have set the default cooldown to 120, but it didn't change anything.
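For reference, the Lambda described above might look roughly like this sketch (the queue URL, cluster, service name, and messages-per-task ratio are all hypothetical); it only illustrates the desired-count mechanism, not a fix for the 15-minute scale-in delay:

import math
import boto3

sqs = boto3.client("sqs")
ecs = boto3.client("ecs")

QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"  # hypothetical
MESSAGES_PER_TASK = 100  # hypothetical ratio used to size the service

def handler(event, context):
    # Read the approximate backlog from SQS.
    attrs = sqs.get_queue_attributes(
        QueueUrl=QUEUE_URL,
        AttributeNames=["ApproximateNumberOfMessages"],
    )
    backlog = int(attrs["Attributes"]["ApproximateNumberOfMessages"])

    # Derive a desired task count (0 when the queue is empty).
    desired = math.ceil(backlog / MESSAGES_PER_TASK)

    # Update the ECS service; the capacity provider's ASG then follows the task count.
    ecs.update_service(
        cluster="my-cluster",    # hypothetical
        service="my-service",    # hypothetical
        desiredCount=desired,
    )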

AWS Auto Scaling scales instances down too slowly?

I have two metrics for the instances attached to my load balancer and Auto Scaling group: one to scale instances up based on CPU utilization and one to scale them down.
The scale-up side works fine. The scale-down side (when CPU utilization is under 40%) seems to work at first: the alarm goes into the "In Alarm" state with <40% CPU and removes 1 instance as it should, but after that it doesn't react and never takes more than 1 instance down. I have left it like this for up to 7 minutes with just the 1 instance being taken down.
Any idea why it might be doing this?
You can set the value below to 180 minutes.
You can also set the two settings below, but ensure these two are kept the same:
Health check grace period -> this tells Auto Scaling that after an EC2 instance launches, it should wait x minutes so that the instance can set everything up
Default cooldown -> Auto Scaling waits for this period (300 seconds by default) before triggering the next scaling activity
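As a minimal sketch of setting both values on the group (boto3; the group name and numbers are placeholders):

import boto3

autoscaling = boto3.client("autoscaling")

# Keep the grace period and default cooldown aligned so Auto Scaling doesn't
# scale again (or fail health checks) before new instances have finished setting up.
autoscaling.update_auto_scaling_group(
    AutoScalingGroupName="my-asg",   # hypothetical group name
    HealthCheckGracePeriod=180,      # seconds to wait after launch before health checks count
    DefaultCooldown=180,             # seconds to wait between simple scaling activities
)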

GCP managed instance group won't scale to zero

I have a GCP managed instance group that I want to scale out and in between 0 and 1 instances using a cron schedule.
GCP has a limitation that says:
Scaling schedules can only be used for MIGs that have at least one other type of autoscaling signal, such as a signal for scaling based on average CPU utilization, load balancing serving capacity, or Cloud Monitoring metrics.
So I must specify an additional autoscaling signal.
The documentation goes on to suggest a workaround:
to scale based only on a schedule, you can set your CPU utilization target to 100%.
So I did. But then the managed group does not scale in to 0, it just stays at 1.
I've not used the Scale-in controls, so the only thing AFAICT that can prevent scale in is the 10 minute Stabilization period, which I have accounted for.
My autoscaler configuration:
{
"name":"myname",
"target":"the/url",
"autoscalingPolicy":{
"minNumReplicas":0,
"maxNumReplicas":1,
"scalingSchedules":{
"out":{
"minRequiredReplicas":1,
"schedule":"0,20,40 * * * *",
"durationSec":300,
"description":"scale out"
}
},
"cpuUtilization":{
"utilizationTarget":1
}
}
}
The schedule itself sets 5 minutes of scale-out to 1 instance, and then there are 10 minutes of stabilization, and then scale in to 0 should happen, but it does not.
If I use the same configuration but only change maxNumReplicas=2 and minRequiredReplicas=2, the autoscaler does scale in and out at the expected times, but between 1 and 2 instances. I think this means the schedule itself is fine.
My theory is that the cpuUtilization signal prevents scaling in to 0. Is there a way I could scale between 0 and 1 on a schedule? Perhaps with another signal, not cpuUtilization?
Thanks!
You are only allowing autoscaling once CPU utilization reaches 100% (per the autoscaling policy). Because of that, performance will be impacted, so you should set the policy target between 60% and 90%.
The minimum number of instances (minNumReplicas) for instance groups, with or without autoscaling, should be 1, so scaling in to 0 is not possible.
With other signals/metrics as well (HTTP load balancing utilization, Stackdriver Monitoring metrics), scaling in to 0 is not possible.
Use scale-in controls. They help if sudden load spikes occur.
Update:
The limitation that an additional autoscaling signal must be specified alongside scaling schedules has since been removed, and it is now possible to configure a schedule that alternates between 0 and 1 instances (but see the general answer below).
When it is possible to scale to 0 instances:
min_num_replicas is set to 0.
Only these autoscaling signals are used: schedules or per-group Cloud Monitoring metrics (or both).
In particular, it is not possible to scale to 0 when one of the autoscaling signals is CPU utilization, LB utilization, or a per-instance Cloud Monitoring metric.
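If that is right, a schedule-only policy along the following lines (shown as a Python dict mirroring the JSON above; the field names are unchanged and the values are illustrative) should be able to reach 0 instances, because no cpuUtilization signal is present:

# Schedule-only autoscaler body; it could be applied via the Compute Engine API or gcloud.
# The group scales out to 1 replica during the scheduled windows and back to
# minNumReplicas (0) once the schedule window and stabilization period have passed.
autoscaler_body = {
    "name": "myname",
    "target": "the/url",
    "autoscalingPolicy": {
        "minNumReplicas": 0,
        "maxNumReplicas": 1,
        "scalingSchedules": {
            "out": {
                "minRequiredReplicas": 1,
                "schedule": "0,20,40 * * * *",
                "durationSec": 300,
                "description": "scale out",
            }
        },
        # Note: no "cpuUtilization" block; per the update above, CPU and LB
        # utilization signals prevent scaling to 0.
    },
}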

AWS - EC2 Auto Scaling Health Check in relation to Cool Down

I have an Amazon EC2 Auto Scaling group; the health check takes 5 minutes to mark an instance healthy, and the default cooldown is 4 minutes. I have a scaling policy that checks whether CPU usage is at 70% for n one-minute datapoints and then adds n instances.
Will I have an issue? Does the scaling policy add to the default cooldown timing? My understanding is that I will have an issue because my default cooldown is less than my health check time.
Scenario: new instances are launched, the health check hasn't passed yet (5 minutes), and another scaling activity happens (after 4 minutes). Is this an issue, or is that a valid statement?
Thanks in advance.
You probably don't want to be adding so many instances at once. Rather than removing 2 and adding 3, try configuring the scaling policies to add/remove one instance at a time.
The Health Check is used to identify unhealthy instances and remove/replace them in the Auto Scaling group.
The Cooldown is used to avoid adding/removing instances before the metric settles into a regular pattern. For example, an instance might take a while before it starts accepting requests, which could impact the metric you are using for scaling.
You shouldn't really need to worry about how they interact with each other, but make sure the health check has enough time for the instance to start responding to queries.

How to use AWS Autoscaling Effectively

Please let me know the answer to the question below:
While reviewing the Auto Scaling events for your application, you notice that the application is scaling up and down multiple times in the same hour. What design change would you make to optimize for cost while preserving elasticity?
A. Modify the Auto Scaling group termination policy to terminate the oldest instance first.
B. Modify the Auto Scaling group termination policy to terminate the newest instance first.
C. Modify the CloudWatch alarm period that triggers the Auto Scaling scale-down policy.
D. Modify the Auto Scaling group cooldown timers.
E. Modify the Auto Scaling policy to use scheduled scaling actions.
I'm guessing D & E. Please suggest!
This is a question from one of the many "Become AWS certified!" websites. The purpose of such questions is to determine whether you understand AWS well enough to be officially recognised via certification. If you are merely asking people for the correct answer, then you are only learning the answer... not the actual knowledge!
If you truly researched Auto Scaling and thought about it, here are some of the things you should be thinking about. I present this information hoping that you'll actually learn about AWS rather than just memorising answers (which won't help you in the real world).
Scaling In/Out vs Up/Down
Auto Scaling is all about launching additional Amazon EC2 instances when they are required (eg during times of peak load) and terminating them when they are no longer needed, thereby saving money.
Since instances are being added and removed, this is referred to as Scaling Out and Scaling In. Try to avoid using terms such as Scaling Up and Scaling Down, since they suggest that the instances are being made bigger and smaller (which is not the case).
Scaling Out & In multiple times per hour
The assumption in this statement is that such scaling is not desired, which is true. Amazon EC2 is charged per hour, so adding instances and then removing them within a short period of time wastes money. This is known as thrashing.
In general, it is a good idea to Scale Out quickly and Scale In slowly. When a system needs extra capacity (Scale Out), it will want it fairly quickly to satisfy demand. When it no longer needs as much capacity, it might be worthwhile waiting before Scaling In because demand might increase again very soon thereafter.
Therefore, it is important to get the right alarm to trigger a scaling action and to wait a while before trying to scale again.
Optimize for cost while preserving elasticity
When an exam question makes a statement about optimizing, it's giving you a hint that the primary goal should be cost minimization, even if other choices might make more sense. Therefore, you want the solution to Scale In when possible, while avoiding thrashing.
Termination Policies
When an Auto Scaling Policy is triggered to remove instances, Auto Scaling uses the termination policy to determine which instance(s) to remove. This is, therefore, irrelevant to the question because optimizing for cost while preserving elasticity is only impacted by the number of instances, not which instances are actually terminated.
CloudWatch Alarms
Auto Scaling actions can be triggered by CloudWatch alarms, such as "average CPU < 70% for 15 minutes". A rule with a longer time period means that it will react to longer-term changes rather than temporary changes, which certainly helps avoid thrashing. However, it also means that Auto Scaling will take longer to respond to changes in demand.
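As a rough boto3 sketch (the alarm name, group name, and policy ARN are hypothetical), the evaluation window is simply the alarm's Period multiplied by its EvaluationPeriods; lengthening either damps thrashing at the cost of slower reactions:

import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm that only fires after average CPU has stayed below 70% for
# 15 minutes (3 x 5-minute periods), so short dips don't trigger a scale-in.
cloudwatch.put_metric_alarm(
    AlarmName="asg-cpu-low",                                           # hypothetical
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "AutoScalingGroupName", "Value": "my-asg"}],  # hypothetical group
    Statistic="Average",
    Period=300,
    EvaluationPeriods=3,
    Threshold=70,
    ComparisonOperator="LessThanThreshold",
    AlarmActions=["arn:aws:autoscaling:..."],                          # hypothetical scale-in policy ARN
)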
Cooldowns
From the Auto Scaling documentation:
The Auto Scaling cooldown period is a configurable setting for your Auto Scaling group that helps to ensure that Auto Scaling doesn't launch or terminate additional instances before the previous scaling activity takes effect. After the Auto Scaling group dynamically scales using a simple scaling policy, Auto Scaling waits for the cooldown period to complete before resuming scaling activities.
This is very useful, because newly-launched instances take some time (eg for booting, configuring) before they can take some of the application workload. If the cooldown is too short, then Auto Scaling might launch additional instances before the first one is ready. The result is that too many instances will be launched, meaning that some will need to Scale In soon after, leading to more thrashing.
Scheduled Actions
Instead of triggering Scale In and Scale Out actions based upon a metric, Auto Scaling can be configured to use Scheduled Actions. For example, increasing the minimum number of instances at 8am before an expected rush, and decreasing the minimum number at 6pm when usage starts to drop off.
Scheduled Actions are unlikely to cause thrashing, since scaling is based on a schedule rather than metrics that frequently change.
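A minimal sketch of the 8am/6pm example above (boto3; the group name, sizes, and timezone handling are assumptions):

import boto3

autoscaling = boto3.client("autoscaling")

# Raise the minimum capacity before the expected morning rush...
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName="my-asg",           # hypothetical
    ScheduledActionName="morning-scale-out",
    Recurrence="0 8 * * *",                  # 08:00 every day (UTC unless a TimeZone is given)
    MinSize=4,
)

# ...and lower it again in the evening when usage drops off.
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName="my-asg",
    ScheduledActionName="evening-scale-in",
    Recurrence="0 18 * * *",                 # 18:00 every day
    MinSize=1,
)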
The Correct Answer
The correct answer is... I'm not going to tell you! However, by reading the above information and trying to grok how Auto Scaling works, you will hopefully come to a better understanding of the question and arrive at a suitable answer.
This way, you will have learned something rather than merely memorizing the answers.