How to set an alarm not to trigger action after autoscaling - amazon-web-services

I have a problem setting up my Auto Scaling group. I created an alarm that, when triggered, makes the group add a new EC2 instance. The group has a default cooldown period of 200 seconds, but the alarm keeps recording data during that time and is triggered again. That makes the group launch another machine, and it ends up in a loop that scales out to all the available machines.
How can I configure the Auto Scaling group so that it ignores the second triggered alarm? Is there some configuration option I'm missing? Thanks in advance.
EDIT:
These are the metrics and scaling policies that trigger my group:
And this is the reason why I think that the autoscaling is still receiving alarms. Because terminations and launchings overlap in time.

I am not sure which type of health check you are using, but there is a setting called the health check grace period:
Frequently, an Auto Scaling instance that has just come into service needs to warm up before it can pass the health check. Amazon EC2 Auto Scaling waits until the health check grace period ends before checking the health status of the instance
https://docs.aws.amazon.com/autoscaling/ec2/userguide/healthcheck.html
That may be the configuration you are missing.
AWS autoscale ELB status checks grace period
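As a minimal sketch (the group name my-asg and the 300-second values are assumptions, not from the question), both the grace period and the cooldown can be adjusted with the AWS CLI:

```shell
# Lengthen the health check grace period and the default cooldown so a
# freshly launched instance can warm up before it is evaluated again.
# Group name and durations are placeholders; pick values that cover
# your instances' boot time.
aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name my-asg \
  --health-check-grace-period 300 \
  --default-cooldown 300
```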

Autoscaling policy not launching new instances

I have set up several types of dynamic scaling policies in my Auto Scaling group, which uses a launch template, to see it launch new instances when the policies are triggered, but it doesn't. I just set up the following tracking policy, with these results:
The tracking policy
The alarm is in the "In alarm" state, and the action was "Successfully executed"
Autoscaling activity is not reporting anything
The alarm always logs that an autoscaling action was triggered, yet the Auto Scaling group does not log any activity.
The group is set up with min: 1, max: 6; currently only 1 instance is running.
Where or how can I find the error that is causing this? Is it perhaps something related to permissions? All instances are healthy/in service.
I've spent the past couple of days going through threads on Stack Overflow and other forums and haven't found anything that helps me locate the issue.
Also, the timestamps differ between the screenshots because I had just done a re-deployment, but the behavior was the same before that: there is never any activity on the Auto Scaling group from the action fired by the alarm...
I tried to replicate your scaling policy and environment:
Created an ASG
Assigned a target tracking policy with a target value of 0.0001
My max capacity is 2, desired 1, and minimum 1.
The key point here is to wait for the CloudWatch alarm to collect enough data points; my scale-out activity took around 5 minutes (most of that time goes to instance warm-up).
I just found the cause of this issue. Under the Auto Scaling group, go to Details, then Advanced configurations, then Suspended processes. Nothing should be selected in this field; in my case, alarm notifications were suspended there along with a few other processes, which is why autoscaling wasn't acting on CloudWatch alarms.
Autoscaling group advanced configurations section
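The same check and fix can be done from the CLI; a sketch, assuming a group named my-asg:

```shell
# List any suspended processes on the group (group name is a placeholder)
aws autoscaling describe-auto-scaling-groups \
  --auto-scaling-group-names my-asg \
  --query 'AutoScalingGroups[0].SuspendedProcesses'

# Resume the process that lets CloudWatch alarms trigger scaling
aws autoscaling resume-processes \
  --auto-scaling-group-name my-asg \
  --scaling-processes AlarmNotification
```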

Terminate specific ec2 instance in an autoscaling group

I've created an AWS CloudWatch alarm based on the ASG's group metric CPUUtilization. It sends an email alert whenever CPUUtilization exceeds 99% for more than an hour.
Is there a way to execute an event/action that terminates the specific EC2 instances that triggered the alarm? These instances hang and have to be terminated.
I would create an additional alarm that would terminate any instance that reaches 99% cpu for an hour. This is directly supported by CloudWatch.
From Create Alarms to Stop, Terminate, Reboot, or Recover an Instance:
Using Amazon CloudWatch alarm actions, you can create alarms that automatically stop, terminate, reboot, or recover your EC2 instances. You can use the reboot and recover actions to automatically reboot those instances or recover them onto new hardware if a system impairment occurs.
See https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/UsingAlarmActions.html
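A sketch of such an alarm with the AWS CLI (the alarm name, instance id, and region are placeholders); the `arn:aws:automate:<region>:ec2:terminate` action terminates the instance when the alarm fires:

```shell
# 12 evaluation periods of 300 s = 1 hour above the threshold
aws cloudwatch put-metric-alarm \
  --alarm-name terminate-on-sustained-cpu \
  --namespace AWS/EC2 \
  --metric-name CPUUtilization \
  --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
  --statistic Average \
  --period 300 \
  --evaluation-periods 12 \
  --threshold 99 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:automate:us-east-1:ec2:terminate
```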
I feel a possible solution for this requirement is to write an AWS CLI script that runs every 15 minutes or so, gets the list of all running EC2 instances, and terminates them if needed. You would also need historical data to find instances whose CPU has been at 100% for more than 45 minutes.
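A rough sketch of that script (the 45-minute window is from the answer; everything else, including leaving the actual termination commented out, is an assumption — it also uses GNU `date`):

```shell
#!/bin/sh
# For every running instance, fetch the minimum CPUUtilization over the
# last 45 minutes; if even the lowest datapoint is near 100%, the
# instance has been pegged the whole time and is a candidate.
for id in $(aws ec2 describe-instances \
    --filters Name=instance-state-name,Values=running \
    --query 'Reservations[].Instances[].InstanceId' --output text); do
  low=$(aws cloudwatch get-metric-statistics \
    --namespace AWS/EC2 --metric-name CPUUtilization \
    --dimensions Name=InstanceId,Value="$id" \
    --statistics Minimum --period 300 \
    --start-time "$(date -u -d '45 minutes ago' +%FT%TZ)" \
    --end-time "$(date -u +%FT%TZ)" \
    --query 'min(Datapoints[].Minimum)' --output text)
  # Replace echo with `aws ec2 terminate-instances --instance-ids "$id"`
  # once you trust the result.
  echo "$id min CPU over 45m: $low"
done
```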

AWS autoscaling Lifecycle Terminating:Wait not working as I expected it to work

I have recently set up ASG lifecycle hooks, so the ASG instances go into the Terminating:Wait lifecycle state for 5 minutes and are then terminated automatically. But I observed that while instances are in Terminating:Wait, no new scale-up activity is triggered. Even though the trigger condition is satisfied, the desired capacity is only updated after all the instances in Terminating:Wait have terminated. Is it because the ASG looks for healthy instances (since the Terminating:Wait instances are still in a healthy state)? Can someone please help me out with this issue?
Are you attaching the VPC while creating the launch configuration (LC)?
Try creating the LC without attaching the VPC and security group, and make sure you have selected the VPC in the ASG (Auto Scaling group).
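Separately, if an instance does not need the full 5-minute wait, any script can release it early by completing the lifecycle action; a sketch with placeholder names:

```shell
# Release an instance from Terminating:Wait without waiting for the
# hook timeout (group, hook, and instance names are placeholders)
aws autoscaling complete-lifecycle-action \
  --auto-scaling-group-name my-asg \
  --lifecycle-hook-name my-terminate-hook \
  --instance-id i-0123456789abcdef0 \
  --lifecycle-action-result CONTINUE
```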

Why does CPUUtilization Alarm have always INSUFFICIENT_DATA state?

I'm trying to create an Auto Scaling group that scales based on the CPUUtilization of the target group.
I managed to create the Auto Scaling group, and when I execute the scaling policies with some test data, it works.
I created 2 alarms in CloudWatch; however, they are stuck in the "INSUFFICIENT_DATA" state.
The alarms should be checking the CPUUtilization of the Auto Scaling group.
So, how can I run autoscaling based on the CPUUtilization of the target group?
The screenshots are below:
The loadbalancer configuration
The target group configuration
The autoscaling configuration
The scaling policies
The CloudWatch alarm config
Alarm configuration for Scale-UP
Alarm configuration for Scale-Down
https://serverfault.com/questions/479689/cloudwatch-alarms-strange-behavior
I found this answer. My alarm uses a 1-minute period, but my CloudWatch monitoring wasn't detailed.
When I change the alarm period from 1 minute to 1 hour, the alarm works. However, I need the alarm to evaluate on a 1-minute period. I enabled detailed monitoring, but it still doesn't work.
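For reference, basic monitoring publishes EC2 metrics only every 5 minutes, so a 1-minute alarm period sees gaps and reports INSUFFICIENT_DATA until detailed monitoring is on. A sketch of enabling it from the CLI (instance id and group name are placeholders):

```shell
# Switch the instance to detailed (1-minute) monitoring
aws ec2 monitor-instances --instance-ids i-0123456789abcdef0

# If the alarm is on ASG group metrics, enable metrics collection too
aws autoscaling enable-metrics-collection \
  --auto-scaling-group-name my-asg \
  --granularity 1Minute
```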

AWS Cloudwatch alarm for each single instance of an auto scaling group

We have configured an Auto Scaling group in AWS, and it works fine. We have configured some alarms for the group, such as: send an alarm if the average CPUUtilization > 60 for 2 minutes, using the AWS CLI.
The only problem is that if we want to monitor each instance in the group, we have to configure the alarms manually. Is there any way to do it automatically, such as with a config or template?
Amazon CloudWatch alarms can be created on the Auto Scaling group as a whole, such as Average CPUUtilization. This is because alarms are used to tell Auto Scaling when to add/remove instances and such decisions would be based upon the group as a whole. For example, if one machine is 100% busy but another is 0% busy, then on average the group is only 50% busy.
There should be no reason for placing an alarm on the individual instances in an auto-scaling group, at least as far as triggering a scaling action.
There is no in-built capability to specify an alarm that will be applied against each auto-scaled instance individually. You could do it programmatically by responding to an Amazon SNS notification whenever an instance is added/removed by Auto Scaling, but this would require your own code to be written.
You can accomplish this with lifecycle hooks and a little Lambda glue. When lifecycle events fire for an instance being added or terminated, a Lambda function can create an alarm on that individual instance or remove it (depending on the event).
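The core of that glue is just creating and deleting one alarm per instance. A CLI sketch, where the instance id, SNS topic ARN, and account number are placeholders and the threshold matches the 60%-for-2-minutes example from the question:

```shell
# On instance launch: create a per-instance CPU alarm
aws cloudwatch put-metric-alarm \
  --alarm-name "cpu-high-i-0123456789abcdef0" \
  --namespace AWS/EC2 --metric-name CPUUtilization \
  --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
  --statistic Average --period 60 --evaluation-periods 2 \
  --threshold 60 --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:my-alerts

# On instance termination: remove its alarm
aws cloudwatch delete-alarms --alarm-names "cpu-high-i-0123456789abcdef0"
```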
To John's point, this is a little bit of an anti-pattern with horizontal scaling and load balancing. However, theory and practice sometimes diverge.