I would like to create a CloudWatch alarm that sends an email when I forget to delete my RDS instance after use. So I want an alarm that only triggers while the RDS instance is available. My initial approach is the following:
Create an alarm based on "CPUUtilization" and have it trigger when the utilization has on average been between 0 and 1 percent for about 1 or 2 hours.
However, so far I can only express one constraint: I can have the alarm trigger when the utilization is below 1 percent for about 1 or 2 hours, but that means it will also trigger when the instance has been deleted.
Can anyone help me figure out how to tackle this problem?
If you stop your RDS instance, it will stop publishing metrics. Your alarm will go into INSUFFICIENT_DATA state, so your ALARM actions won't be executed.
More about CloudWatch Alarms here: http://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/AlarmThatSendsEmail.html
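As a minimal sketch of such an alarm with boto3 (the DB instance identifier and SNS topic ARN below are hypothetical placeholders), leaving TreatMissingData at its default of "missing" is what keeps a deleted or stopped instance in INSUFFICIENT_DATA instead of ALARM:

import boto3

cloudwatch = boto3.client("cloudwatch")

# Hypothetical names - replace with your own DB instance and SNS topic.
cloudwatch.put_metric_alarm(
    AlarmName="rds-forgotten-instance",
    AlarmDescription="RDS instance is up but idle (CPU < 1% for 2 hours)",
    Namespace="AWS/RDS",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "DBInstanceIdentifier", "Value": "my-db-instance"}],
    Statistic="Average",
    Period=3600,                 # one-hour periods ...
    EvaluationPeriods=2,         # ... for two consecutive hours
    Threshold=1.0,
    ComparisonOperator="LessThanThreshold",
    TreatMissingData="missing",  # deleted/stopped instance -> INSUFFICIENT_DATA, no email
    AlarmActions=["arn:aws:sns:eu-west-1:123456789012:rds-idle-alerts"],
)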
Related
I'm setting up an AWS EC2 Autoscaling Group (ASG) and it's using TargetTrackingScaling that is tracking a custom metric. This metric is published by each instance in the ASG every 30 seconds.
It's working fine, but I'd like the scale-out action to happen more quickly. The cause appears to be the alarm that the ASG auto-generates: it looks like it waits for at least 3 datapoints over 3 minutes before going into alarm.
Is there a way I can configure the ASG/scaling policy so that the alarm only needs 1 datapoint (or less time) before going into alarm? Or, if that's not possible, can I create a custom alarm and use it instead of the one the ASG auto-generated?
In my AWS auto scaling setup, I have configured 4 alarms:
Add instance when CPU utilization > 20
Add instance when TargetResponseTime > 0.9
Remove instance when CPU utilization < 20
Remove instance when TargetResponseTime < 0.9
What will happen if two or more alarms are triggered together?
For example:
If alarms 1 and 2 are triggered together, will it add two instances?
If alarms 1 and 4 are triggered together, will it remove an instance and add one, or will it stay neutral?
The alarms are working fine, but I want to understand the mechanism behind alarm action execution.
Any idea?
Your auto scaling group has a cooldown period, so technically multiple actions cannot occur at the same time. The next action would occur after the cooldown period has passed.
This functionality exists to prevent exactly what you're describing: multiple scaling actions happening at once.
Personally, I think for what you're doing you should make use of a composite CloudWatch alarm. With an OR condition, these 4 alarms could become 2, which would reduce the number of alarms you need to trigger an auto scaling action.
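As a rough sketch in boto3 (the alarm and topic names are hypothetical, and the two referenced alarms are assumed to be your existing scale-out alarms; note that a composite alarm's own actions are SNS notifications):

import boto3

cloudwatch = boto3.client("cloudwatch")

# Hypothetical names - combine the two existing scale-out alarms with OR.
cloudwatch.put_composite_alarm(
    AlarmName="scale-out-needed",
    AlarmRule='ALARM("cpu-utilization-high") OR ALARM("target-response-time-high")',
    AlarmActions=["arn:aws:sns:eu-west-1:123456789012:scaling-alerts"],
)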
Is it possible to set a CloudWatch alarm for when we are approaching the limit of EC2 instances currently allowed on our account?
For instance, if the limit for EC2 instances is currently 250, when instance number 240 is provisioned, I want an alarm to trigger.
If you have an auto scaling group which launches new instances and you want to keep an eye on it, you can use GroupInServiceInstances, which gives you the number of instances running as part of the ASG. Read more here.
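A sketch of an alarm on that metric with boto3 (the ASG name, threshold, and SNS topic are hypothetical):

import boto3

cloudwatch = boto3.client("cloudwatch")

# Hypothetical ASG name and threshold - alarm when the group approaches its maximum size.
# Group metrics collection must be enabled on the ASG for this metric to be published.
cloudwatch.put_metric_alarm(
    AlarmName="asg-near-capacity",
    Namespace="AWS/AutoScaling",
    MetricName="GroupInServiceInstances",
    Dimensions=[{"Name": "AutoScalingGroupName", "Value": "my-asg"}],
    Statistic="Maximum",
    Period=300,
    EvaluationPeriods=1,
    Threshold=240,
    ComparisonOperator="GreaterThanOrEqualToThreshold",
    AlarmActions=["arn:aws:sns:eu-west-1:123456789012:capacity-alerts"],
)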
Yes, you could do this with a Lambda function, a CloudWatch Metric and a CloudWatch alarm.
Your alarm would be configured to fire on the metric if it exceeds some threshold (the threshold being your instance limit).
Your Lambda function would run on a schedule, e.g. every 5 minutes, and would do the following:
Use the ec2:DescribeAccountAttributes API to get the account instance limit and cloudwatch:DescribeAlarms to get the current threshold of the alarm. If they differ, update the alarm threshold to the instance limit via the cloudwatch:PutMetricAlarm API.
Use the ec2:DescribeInstances API and count the number of instances that are running and publish the value to a custom CloudWatch metric with the cloudwatch:PutMetricData API.
If the value published to the metric exceeds the threshold of the alarm, it will fire. The Lambda function keeps the alarm threshold in line with the account's instance limit and publishes datapoints to the metric based on the number of instances currently running.
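A rough sketch of such a Lambda handler (boto3; the alarm name, custom namespace, and SNS topic ARN are hypothetical placeholders):

import boto3

ec2 = boto3.client("ec2")
cloudwatch = boto3.client("cloudwatch")

ALARM_NAME = "ec2-instance-limit"   # hypothetical alarm name
METRIC_NAMESPACE = "Custom/EC2"     # hypothetical custom namespace
METRIC_NAME = "RunningInstances"

def handler(event, context):
    # 1. Read the account's EC2 instance limit.
    attrs = ec2.describe_account_attributes(AttributeNames=["max-instances"])
    limit = int(attrs["AccountAttributes"][0]["AttributeValues"][0]["AttributeValue"])

    # 2. Keep the alarm threshold in sync with the limit.
    alarms = cloudwatch.describe_alarms(AlarmNames=[ALARM_NAME])["MetricAlarms"]
    if not alarms or alarms[0]["Threshold"] != limit:
        cloudwatch.put_metric_alarm(
            AlarmName=ALARM_NAME,
            Namespace=METRIC_NAMESPACE,
            MetricName=METRIC_NAME,
            Statistic="Maximum",
            Period=300,
            EvaluationPeriods=1,
            Threshold=limit,  # alarm at the limit; use limit - 10 to be warned earlier
            ComparisonOperator="GreaterThanOrEqualToThreshold",
            AlarmActions=["arn:aws:sns:eu-west-1:123456789012:limit-alerts"],  # hypothetical topic
        )

    # 3. Count running instances and publish the datapoint.
    count = 0
    paginator = ec2.get_paginator("describe_instances")
    for page in paginator.paginate(
        Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
    ):
        for reservation in page["Reservations"]:
            count += len(reservation["Instances"])

    cloudwatch.put_metric_data(
        Namespace=METRIC_NAMESPACE,
        MetricData=[{"MetricName": METRIC_NAME, "Value": count, "Unit": "Count"}],
    )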
What is the best way to check EC2 instance uptime and send alerts if the uptime for an instance is more than N hours? How can this be organized with default AWS tools such as CloudWatch and Lambda?
Here's another option which can be done just in CloudWatch.
Create an alarm for your EC2 instance with something like CPUUtilization - you will always get a value for this when the instance is running.
Set the alarm to >= 0; this will ensure that whenever the instance is running, it matches.
Set the period and consecutive periods to match the required alert uptime, for example for 24 hours you could set the period to 1 hour and the consecutive periods to 24.
Set an action to send a notification when the alarm is in ALARM state.
Now, when the instance has been running for less than the set time, the alarm will be in INSUFFICIENT_DATA state. Once it has been up for the full uptime, it will go to ALARM state and the notification will be sent.
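A minimal boto3 sketch of that alarm (instance ID and SNS topic are hypothetical; 24 one-hour periods as in the example above):

import boto3

cloudwatch = boto3.client("cloudwatch")

# Hypothetical instance ID and SNS topic.
cloudwatch.put_metric_alarm(
    AlarmName="ec2-up-more-than-24h",
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
    Statistic="Average",
    Period=3600,                 # 1-hour periods ...
    EvaluationPeriods=24,        # ... for 24 consecutive hours
    Threshold=0,
    ComparisonOperator="GreaterThanOrEqualToThreshold",  # matches whenever the instance reports data
    TreatMissingData="missing",  # stopped instance -> INSUFFICIENT_DATA, alarm resets
    AlarmActions=["arn:aws:sns:eu-west-1:123456789012:uptime-alerts"],
)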
One option is to use the AWS CLI to get the launch time. From that, calculate the uptime and send it to CloudWatch:
aws ec2 describe-instances --instance-ids i-00123458ca3fa2c4f --query 'Reservations[*].Instances[*].LaunchTime' --output text
Output
2016-05-20T19:23:47.000Z
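The same idea in boto3, computing the uptime in hours from LaunchTime and publishing it as a custom metric (the Custom/EC2 namespace and UptimeHours metric name are hypothetical):

from datetime import datetime, timezone
import boto3

INSTANCE_ID = "i-00123458ca3fa2c4f"   # same instance ID as the CLI example

ec2 = boto3.client("ec2")
cloudwatch = boto3.client("cloudwatch")

# Fetch the launch time and compute the uptime in hours.
reservations = ec2.describe_instances(InstanceIds=[INSTANCE_ID])["Reservations"]
launch_time = reservations[0]["Instances"][0]["LaunchTime"]  # timezone-aware datetime
uptime_hours = (datetime.now(timezone.utc) - launch_time).total_seconds() / 3600

# Publish it as a custom metric; a CloudWatch alarm can then watch this value.
cloudwatch.put_metric_data(
    Namespace="Custom/EC2",   # hypothetical namespace
    MetricData=[{
        "MetricName": "UptimeHours",
        "Dimensions": [{"Name": "InstanceId", "Value": INSTANCE_ID}],
        "Value": uptime_hours,
        "Unit": "Count",
    }],
)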
Another option is to periodically run a cron job script (sketched below) that:
calls the uptime -p command
converts the output to hours
sends the result to CloudWatch as a custom metric with unit Count
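A sketch of that cron job in Python, reading /proc/uptime (which already gives seconds) rather than parsing the uptime -p output; the Custom/EC2 namespace is a hypothetical placeholder:

import boto3

# Read the machine uptime in seconds (Linux) and convert to hours.
with open("/proc/uptime") as f:
    uptime_seconds = float(f.read().split()[0])
uptime_hours = uptime_seconds / 3600

# Push the value to CloudWatch as a custom metric with unit Count.
boto3.client("cloudwatch").put_metric_data(
    Namespace="Custom/EC2",   # hypothetical namespace
    MetricData=[{
        "MetricName": "UptimeHours",
        "Value": uptime_hours,
        "Unit": "Count",
    }],
)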
After adding the cron job:
add a CloudWatch alarm that sends an alert when this value exceeds a threshold, or when the alarm is in INSUFFICIENT_DATA
INSUFFICIENT_DATA means the machine is not up
I would recommend looking into an AWS-native way of doing this.
If it is basically about sending OS-level metrics (e.g. free memory, uptime, disk usage) to CloudWatch, this can be achieved by following this guide, which installs the CloudWatch Logs agent on your EC2 instances:
http://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/QuickStartEC2Instance.html
The great thing about this is that the metrics then show up in CloudWatch Logs (see the attached picture, which shows the CloudWatch Logs interface in the AWS Console).
We have an OpsWorks stack with two 24x7 instances, four time-based instances, and two load-based instances.
Our issue is with the load-based instances. We've spent a great deal of time creating meaningful-to-our-service cloudwatch alarms. Thus, we want the load-based instances in our stack to come UP when a particular cloudwatch latency alarm is in an ALARM state. I see that in the load-based instance configuration, you can define a cloudwatch alarm for bringing the instance(s) UP and you can define a cloudwatch alarm for bringing the instance(s) DOWN.
Thing is, when I select the specific cloudwatch alarm I want to use to trigger the UP, it removes that cloudwatch alarm from being selected as the trigger for DOWN. Why?
Specifically, we want our latency alarm (we'll call it the "oh crap things are slowing down" cloudwatch alarm) to trigger the load-based instances to START when in an ALARM state. Then, we want the "oh crap things are slowing down" cloudwatch alarm to trigger the load-based instances to SHUTDOWN when in an OK state. It would be rad if the load-based instances waited 15 minutes after the OK state of the alarm before shutting down.
The "oh crap things are slowing down" threshold is Latency > 2 for 3 minutes
Do I just need to create a new "oh nice things are ok" alarm with a threshold of Latency < 2 for 3 minutes to use as the DOWN alarm in the load-based instance configuration?
Sorry for the newbie question, just feel stuck.
From what I can tell, you have to add a second alarm that triggers only when the latency is below 2 for three minutes. If someone else comes up with a cleaner solution than this, I'd love to hear about it. As it is, you'll always have one of the alerts in a continuous state of alarm.
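A sketch of that second "things are ok" alarm with boto3 (the load balancer name is hypothetical, and the metric assumes a Classic ELB in front of the stack):

import boto3

cloudwatch = boto3.client("cloudwatch")

# Hypothetical "oh nice things are ok" alarm to attach as the DOWN trigger.
cloudwatch.put_metric_alarm(
    AlarmName="latency-ok-scale-down",
    Namespace="AWS/ELB",
    MetricName="Latency",
    Dimensions=[{"Name": "LoadBalancerName", "Value": "my-load-balancer"}],
    Statistic="Average",
    Period=60,
    EvaluationPeriods=3,   # Latency < 2 for 3 consecutive minutes
    Threshold=2,
    ComparisonOperator="LessThanThreshold",
)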