I am working on a project that needs to use Lambda based custom termination policy. Note that some of the instances reserved with the ASG might be scale in protected.
My question is, when a scale in event occurs, does the ASG suggest scale-in protected instances and gives them as input(s) to the Lambda that runs the custom termination policy?
Thanks in advance!
Reference: https://docs.aws.amazon.com/autoscaling/ec2/userguide/lambda-custom-termination-policy.html
Tried reading the documentation and the documentation doesn't explicitly state that the ASG doesn't suggest scale-in protected instances to the Lambda. It's only mentioned that the ASG will not terminate the scale-in protected instance.
Related
I am facing an issue that I have integrated a lambda function for auto starting and stopping of a EC2 instance according to my office hours. However, The issue is that EC2 is on ASG and ASG creates redundant Instance automatically.
Could someone please suggest a way to schedule ASG operation to stop and start according to my requirements other than disabling the ASG or removing that instance from ASG.
You can suspend an Auto Scaling group (ASG) according to your EC2 auto-start and stop schedule by using AWS Lambda functions. Lambda functions can be used to trigger the ASG suspend and resume actions at the desired times. You can also use CloudWatch Events to trigger the Lambda functions at the desired times.
More details here
I solved this case by using automatic Scaling and creating one schedule to shut down by putting 0s in desired, min and max capacity.
And another scheduled to start by putting 1s in desired, min and max capacity(since my requirement is 1 instance at a time)
I am not sure if this is the best practice but I solved my issue via this technique.
I have AWS infra as ECS + ASG for running container with scale-in protection disabled (to save some amount during night time where less traffic).
Issue i am facing is during scale-in event ECS drained traffic from one container, But ASG terminates instance with container running.
Termination policy does not specify anything which solve this case.
How do i make in-sync such that only that instance get terminated with no container running?
Any suggestion will be helpful.
Thanks in advance.
So, this case can be handled by enabling managedTerminationProtection option while creating capacity-provider.
I am starting up a bunch of ec2 instances in an autoscaling group with autoscaling:EC2_INSTANCE_LAUNCHING and autoscaling:EC2_INSTANCE_TERMINATING lifecycle hooks. When I initiate an instance termination using aws management console, the instance gets terminated without waiting for me to complete the lifecycle action https://docs.aws.amazon.com/cli/latest/reference/autoscaling/complete-lifecycle-action.html
The instance state in autoscaling groups UI shows up as Terminating:Wait . The instance state in EC2 Instances UI shows up as Terminated . This is preventing me from taking an corrective actions before completing the lifecycle action and actually terminating the instance.
The same does not seem to apply for the case when I reduce the desired instances size in autoscaling group. It seems to go through the proper lifecycle stages when I reduce the desired instance size which in turn causes instance terminations.
Is this how aws asg lifecycle hooks are intended to work? They are pretty much useless for any asg instance terminations triggered outside of changing the desired instance size for the asg.
Yes, the lifecycle hooks will be called when Auto Scaling performs the scale-in/scale-out operation.
The fact that you are directly terminating the instance bypasses Auto Scaling, so it does not have an opportunity to activate the termination hooks. All it sees is that an instance is no longer healthy.
If you wish to terminate a specific instance in an Auto Scaling group, use terminate-instance-in-auto-scaling-group. That tells Auto Scaling to terminate the instance and the hooks will be used.
I have foreseen a problem that could happen with my application but I am unsure if it is possible to solve, and perhaps the architecture needs to be redesigned.
I am using an AutoScalingGroup (ASG) on AWS to create EC2 instances that host game servers that players can join. At the moment, the ASG is scaled manually via a matchmaking API which changes the desired capacity based on its needs. The problem occurs when a game server is finished.
When a game finishes, it signals to the matchmaker that it is finished and needs terminating, and the matchmaker will then scale down the ASG accordingly, however, it doesn't seem to know exactly which instance to remove, and it won't necessarily be the one that needs terminating.
I can terminate the instance, but then as the ASG desired capacity is never changed when the instance is terminated, another server is created.
Is there a way I can scale down the ASG, as well as specifying which servers to remove from the group?
In a nutshell, the default termination policy during scale in is designed to remove instances that use the oldest launch configuration.
Currently, Amazon EC2 Auto Scaling supports the following termination policie:
OldestInstance Terminate the oldest instance in the group. This option is useful when you're upgrading the instances in the Auto Scaling group to a new EC2 instance type. You can gradually replace instances of the old type with instances of the new type.
NewestInstance Terminate the newest instance in the group. This policy is useful when you're testing a new launch configuration but don't want to keep it in production.
OldestLaunchConfiguration Terminate instances that have the oldest launch configuration. This policy is useful when you're updating a group and phasing out the instances from a previous configuration.
ClosestToNextInstanceHour Terminate instances that are closest to the next billing hour. This policy helps you maximize the use of your instances and manage your Amazon EC2 usage costs.
Default Terminate instances according to the default termination policy. This policy is useful when you have more than one scaling policy for the group.
Instance protection
One of the possible solutions could be to use Instance protection. The auto-scaling provides an instance protection to control whether instance can be terminated when scaling-in.
Therefore, enable the instance protection for ASG to protect instances from scaling-in by default. Once you are done with you server, decrease a value of desired number of instances, remove instance protection from particular instance (either using CLI or SDK; note that this protection remains enabled for the rest of instances) and auto-scaling will terminate that exact instance.
For more information about instance protection, see Instance Protection
The oldest server is removed. If you want to scale down a specific server, you will have to kill that server before changing desired capacity.
I would like to prevent EC2 instance termination by Auto Scaling feature if that instance is in the middle of some sort of processing.
Background:
Suppose I have an Auto Scaling group that currently has 5 instances running.
I create an alarm on average CPU usage...
Suppose 4 of the instances are idle and one is doing some heavy processing...
The average CPU load will trigger the alarm and as a result the scale-down policy will execute.
How do I get Auto Scaling to terminate one of the idle instances and not the one that is in the middle of the processing?
Update
As noted by Ryan Walls (+1), AWS meanwhile provides Instance Protection to control whether Auto Scaling can terminate a particular instance when scaling in (see the introductory blog post Instance Protection for Auto Scaling for a walk through):
You can enable the instance protection setting on an Auto Scaling
group or an individual Auto Scaling instance. When Auto Scaling
launches an instance, the instance inherits the instance protection
setting of the Auto Scaling group. [...]
It's worth noting that this instance protection only applies to regular Auto Scaling scale in events:
Instance protection does not protect Auto Scaling instances from
manual termination through the Amazon EC2 console, the
terminate-instances command, or the TerminateInstances API. Instance
protection does not protect an Auto Scaling instance from termination
if it fails health checks and must be replaced. Also, instance
protection does not protect Spot instances in an Auto Scaling group
from interruption.
As usual, the feature is available via the AWS Management Console (menu Actions->Instance Protection->Set Scale In Protection)), the AWS CLI (set-instance-protection command), and the API (SetInstanceProtection API action).
The latter two options allow automation of the scenario at hand, i.e. one would need to enable instance protection before running 'heavy processing' jobs, and disable instance protection once they are finished so that the instance is eligible for termination again.
Initial Answer
This functionality is currently not available for Auto Scaling of Amazon EC2 instances - while you are indeed able to Configure [an] Instance Termination Policy for Your Auto Scaling Group, the available policies do not include such a (fairly advanced) concept:
Auto Scaling provides the following termination policy options for you
to choose from. You can specify one or more of these options in your
termination policy.
OldestInstance — Specify this if you want the oldest instance in your Auto Scaling group to be terminated. [...]
NewestInstance — Specify this if you want the last launched instance to be terminated. [...]
OldestLaunchConfiguration — Specify this if you want the instance launched using the oldest launch configuration to be
terminated. [...]
ClosestToNextInstanceHour — Specify this if you want the instance that is closest to completing the billing hour to be
terminated. [...]
Default — Specify this if you want Auto Scaling to use the default termination policy to select instances for termination.
I just successfully dealt with the problem of long-running jobs in an auto scaling group using the relatively recent lifecycle hook feature.
The problem with trying to choose an idle node to terminate, in my case, was that the process that chooses the idle node will race against processes that submit work to the nodes. In this case it's better to use a strategy where any node can be terminated, but termination happens gracefully so that no work is lost. You can then use all of the standard auto scaling policy stuff to manage scale-in and scale-out.
The termination lifecycle hook allows the user (or a process) to perform actions on the node after it has been placed into an intermediate state (labeled Terminating:Wait) by the auto scaling group. The user (or process) is then responsible for completing the lifecycle action via an AWS API call, resulting in the shutdown of the terminated EC2 instance.
The way I set this up, in short, is:
Create a role that allows auto scaling to post a message to an SQS queue.
Create an SQS queue for the termination messages.
Create a monitor script that runs as a service in each node. My script is a simple event-driven state machine that transitions in sequence from MONITORING (polling SQS for a termination message for the node) to DRAINING (polling a job queue until no work is being performed on the node) to TERMINATED (making the complete-lifecycle call).
Standard configuration for event-driven AWS auto-scaling; that is, creating CloudWatch alarms, and the auto-scaling policies for scale-in and scale-out.
One hinderance to this approach is that the lifecycle hook management isn't supported yet in the SDKs (boto, at least, doesn't support it AFAIK), nor are there Cloud Formation resources for the hooks.
The relevant AWS documentation is here:
http://docs.aws.amazon.com/AutoScaling/latest/DeveloperGuide/AutoScalingGroupLifecycle.html
Amazon has finally addressed this issue in a simpler way. There is now "instance protection" where you can mark your instance as protected and it will not be terminated during a "scale in".
See https://aws.amazon.com/blogs/aws/new-instance-protection-for-auto-scaling
aws-cli is your best friend..
Disable your scale down policy on your autoscaling group.
Create a cron job or scheduled task using aws-cli to:
2a. Get the EC2 instances associated with the autoscaling group
http://docs.aws.amazon.com/cli/latest/reference/autoscaling/describe-auto-scaling-instances.html
2b. Next monitor the cloudwatch statistics on the EC2 instances
http://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/US_SingleMetricPerInstance.html
http://docs.aws.amazon.com/cli/latest/reference/cloudwatch/get-metric-statistics.html
2c. Terminate the idle EC2 instance(s) from your auto-scaling group
http://docs.aws.amazon.com/cli/latest/reference/autoscaling/terminate-instance-in-auto-scaling-group.html
You can use Amazon CloudWatch to achieve this:
http://aws.typepad.com/aws/2013/01/amazon-cloudwatch-alarm-actions.html. From the article:
You can use a similar strategy to get rid of instances that are tasked with handling compute-intensive batch processes. Once the CPU goes idle and the work is done, terminate the instance and save some money!
In this case, since you will be handling the termination, you will need to remove the scale-down policy. Also see another option: https://stackoverflow.com/a/19628453/432849.