I am new to AWS and have configured a CloudWatch alarm to reboot an EC2 instance if its StatusCheckFailed check fails.
The issue is that I frequently update the AMI and need to terminate the old instance and relaunch it from the new AMI.
This changes the instance ID, so the CloudWatch alarm I initially created (which references the instance ID) becomes outdated.
What is the best practice for keeping the CloudWatch alarm up to date when the instance ID changes?
I have an AWS::AutoScaling::AutoScalingGroup configuration that runs two EC2 instances. Is it possible to attach CloudWatch alarms to both instances? For example, I want to observe the StatusCheckFailed_Instance metric for each EC2 instance in the group.
Usually you attach alarms through the EC2 instance ID, but how do I find out each instance ID in the Auto Scaling group in order to attach the alarms? Or is there another way to attach them? I can't find anything useful and workable on the internet.
Option 1)
Create your own scripts that are triggered on launch/terminate events.
Each script triggers a Lambda function that reads the instance ID and creates or deletes the alarm.
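Option 1 could be sketched roughly as below, assuming the Lambda function is invoked by the EventBridge events that Auto Scaling emits ("EC2 Instance Launch Successful" and "EC2 Instance Terminate Successful"). The alarm naming scheme, the period and evaluation settings, the fallback region, and the choice of a reboot action are illustrative assumptions, not something stated in the answer.

```python
LAUNCH = "EC2 Instance Launch Successful"
TERMINATE = "EC2 Instance Terminate Successful"


def alarm_name_for(instance_id):
    # Hypothetical naming scheme so the alarm can be found again on terminate.
    return f"status-check-failed-{instance_id}"


def plan_alarm_change(event):
    """Return ("create", put_metric_alarm kwargs), ("delete", alarm name), or None."""
    instance_id = event.get("detail", {}).get("EC2InstanceId")
    if not instance_id:
        return None
    name = alarm_name_for(instance_id)
    region = event.get("region", "us-east-1")  # assumed fallback region
    if event.get("detail-type") == LAUNCH:
        return ("create", {
            "AlarmName": name,
            "Namespace": "AWS/EC2",
            "MetricName": "StatusCheckFailed_Instance",
            "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
            "Statistic": "Maximum",
            "Period": 60,              # assumed: 1-minute datapoints
            "EvaluationPeriods": 3,    # assumed: 3 failing minutes in a row
            "Threshold": 1,
            "ComparisonOperator": "GreaterThanOrEqualToThreshold",
            "AlarmActions": [f"arn:aws:automate:{region}:ec2:reboot"],
        })
    if event.get("detail-type") == TERMINATE:
        return ("delete", name)
    return None


def lambda_handler(event, context):
    import boto3  # imported lazily so the planning logic can be tested without AWS

    plan = plan_alarm_change(event)
    if plan is None:
        return
    action, payload = plan
    cloudwatch = boto3.client("cloudwatch")
    if action == "create":
        cloudwatch.put_metric_alarm(**payload)
    else:
        cloudwatch.delete_alarms(AlarmNames=[payload])
```

Keeping the decision in plan_alarm_change separate from the boto3 calls makes it possible to check the alarm parameters locally before deploying anything.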
Option 2)
If you're not trying to use the auto-recover option (which you shouldn't need in an ASG, since the ASG will just replace the instances), then you can create one aggregate alarm for the ASG.
Create the alarm on the StatusCheckFailed_Instance metric with the AutoScalingGroupName=<> dimension.
Set it to trigger if the Maximum statistic is >= 1. The metric is 0 when the check passes and 1 when it fails, and each instance pushes its own datapoints to the ASG versions of the EC2 metrics, so a maximum of 1 means at least one instance is failing.
Since you only have 2 instances, you can just check both manually if it ever triggers. For larger ASGs, the SEARCH() math expression in the CloudWatch metrics console (or on a dashboard) is a good way to look through all the ASG instances and see which one is failing.
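A minimal sketch of the aggregate alarm from Option 2, written as the keyword arguments one might pass to boto3's put_metric_alarm. Note that the CloudWatch dimension is named AutoScalingGroupName, and because the metric only reports 0 or 1 the comparison here is >= 1. The helper names, alarm name, period, and evaluation count are assumptions for illustration; the SEARCH expression matches every instance in the account and region, not only those in the ASG.

```python
def asg_status_alarm(asg_name, sns_topic_arn=None):
    """Build put_metric_alarm kwargs for one alarm covering the whole ASG."""
    kwargs = {
        "AlarmName": f"{asg_name}-instance-status-check-failed",  # assumed name
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_Instance",
        "Dimensions": [{"Name": "AutoScalingGroupName", "Value": asg_name}],
        "Statistic": "Maximum",       # 1 if any instance in the group is failing
        "Period": 60,                 # assumed
        "EvaluationPeriods": 2,       # assumed
        "Threshold": 1,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
    }
    if sns_topic_arn:
        kwargs["AlarmActions"] = [sns_topic_arn]
    return kwargs


def per_instance_search(metric="StatusCheckFailed_Instance", period=60):
    """Build a SEARCH() expression that plots the metric for each instance."""
    return (
        f"SEARCH('{{AWS/EC2,InstanceId}} MetricName=\"{metric}\"', "
        f"'Maximum', {period})"
    )


if __name__ == "__main__":
    print(asg_status_alarm("my-asg"))  # "my-asg" is a placeholder group name
    print(per_instance_search())
```

The dictionary could then be passed as cloudwatch.put_metric_alarm(**asg_status_alarm("my-asg")), and the expression pasted into a dashboard widget when the alarm fires.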
I just found out that AWS CloudWatch lets you terminate an EC2 instance after a period of inactivity. So I created an alarm that terminates the instance when CPU usage is below 1% for 2 hours. This ended up putting my instance into the alarm state right away, which prevented me from starting it up to test the feature.
I then deleted the CloudWatch alarm in order to be able to launch the EC2 instance again, but even after I deleted the alarm, the state is still Terminated and the Start option in the Actions drop-down is still disabled.
How do I get the instance to start again?
You can't restart a terminated instance. The instance no longer exists. It is just listed as "terminated" in your web console for a little bit so you can see that it was deleted. You have to create a new instance now.
I've created an AWS CloudWatch alarm based on an ASG's group CPUUtilization metric. It sends an email alert whenever CPUUtilization stays above 99% for more than an hour.
Is there a way to execute an event/action that terminates the specific EC2 instances that triggered the alarm? These instances hang and have to be terminated.
I would create an additional alarm that would terminate any instance that reaches 99% cpu for an hour. This is directly supported by CloudWatch.
From Create Alarms to Stop, Terminate, Reboot, or Recover an Instance:
Using Amazon CloudWatch alarm actions, you can create alarms that automatically stop, terminate, reboot, or recover your EC2 instances. You can use the reboot and recover actions to automatically reboot those instances or recover them onto new hardware if a system impairment occurs.
See https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/UsingAlarmActions.html
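As a rough illustration of such a per-instance alarm, the sketch below builds the arguments for boto3's put_metric_alarm with an EC2 action ARN of the form arn:aws:automate:<region>:ec2:<action>. The function names, alarm name, and the 5-minute period are assumptions; only the 99% threshold and one-hour window come from the question.

```python
VALID_ACTIONS = {"stop", "terminate", "reboot", "recover"}


def ec2_action_arn(region, action):
    """Return the alarm action ARN for a built-in EC2 action."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unsupported EC2 alarm action: {action}")
    return f"arn:aws:automate:{region}:ec2:{action}"


def high_cpu_terminate_alarm(instance_id, region, threshold=99.0,
                             minutes=60, period=300):
    """Build put_metric_alarm kwargs that terminate one hung instance."""
    return {
        "AlarmName": f"{instance_id}-high-cpu-terminate",  # assumed name
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": period,                            # seconds per datapoint
        "EvaluationPeriods": minutes * 60 // period,  # e.g. 12 x 5 min = 1 hour
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [ec2_action_arn(region, "terminate")],
    }


if __name__ == "__main__":
    # Placeholder instance ID and region.
    print(high_cpu_terminate_alarm("i-0123456789abcdef0", "us-east-1"))
```

Because the alarm is keyed on InstanceId, one alarm is needed per instance; the existing ASG-level alarm can stay in place for the email notification.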
I think a possible solution for this requirement is to write an AWS CLI script that runs, say, every 15 minutes, gets the list of all running EC2 instances, and terminates them if needed. I also need historical info for EC2 instances whose CPU is at 100% for more than 45 minutes.
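The check such a polling script would need can be sketched as follows, assuming the datapoints have the shape returned by CloudWatch's get_metric_statistics (dicts with "Timestamp" and "Average") at a 5-minute period. The function names and default values are hypothetical, and Python is used here instead of the AWS CLI purely for illustration.

```python
from datetime import datetime, timedelta


def longest_high_cpu_run(datapoints, threshold=100.0,
                         period=timedelta(minutes=5)):
    """Return the longest stretch of consecutive datapoints at or above threshold."""
    longest = current = timedelta(0)
    previous = None  # timestamp of the previous high datapoint, if any
    for point in sorted(datapoints, key=lambda p: p["Timestamp"]):
        if point["Average"] >= threshold:
            contiguous = (previous is not None
                          and point["Timestamp"] - previous == period)
            current = current + period if contiguous else period
            longest = max(longest, current)
            previous = point["Timestamp"]
        else:
            current = timedelta(0)
            previous = None
    return longest


def should_terminate(datapoints, threshold=100.0, min_minutes=45):
    """True if CPU stayed at or above threshold for more than min_minutes."""
    return longest_high_cpu_run(datapoints, threshold) > timedelta(minutes=min_minutes)


if __name__ == "__main__":
    start = datetime(2024, 1, 1)  # made-up sample data
    sample = [{"Timestamp": start + timedelta(minutes=5 * i), "Average": 100.0}
              for i in range(10)]
    print(longest_high_cpu_run(sample), should_terminate(sample))
```

In practice a threshold slightly below 100 is probably safer, since the averaged metric rarely reports exactly 100%.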
I have created an alarm in CloudWatch to stop my EC2 instance if the CPU is < 5% for more than five minutes.
After five minutes I get the email, but the instance continues running. The instance is using an EBS volume and nothing in the history indicates a problem.
Can someone please tell me why the alarm I set up is not stopping the instance?
Thanks
According to the documentation, you need to create an IAM role so that AWS can automatically stop the instance on your behalf when the alarm is triggered. Also note that you cannot assign an IAM role to an existing instance.
Update:
You can now attach to or modify an IAM role of your existing instance. Read more here.
I am new to AWS, so I'm not sure, but I hope this helps.
Amazon EBS may not send metric data for an available volume that is not attached to an Amazon EC2 instance, because there is no metric activity to be monitored for that volume. If you have an alarm set for such a metric, you may notice its state change to Insufficient Data. This may simply be an indication that your resource is inactive, and may not necessarily mean that there is a problem.
While working with a g2.2xlarge spot instance, I tried to set up an alarm that notifies me when the average CPU usage over a two-hour period drops below 5% and then automatically stops the instance. Here's a link to a nice article Amazon wrote up on how to use the stop/start instance feature. The AWS alarms seem to allow this; however, after the alarm triggers I get this reply:
Dear AWS customer,
We are unable to execute the 'Stop' action on Amazon EC2 instance i-e60e21ec that you specified in the Amazon CloudWatch alarm awsec2-i-e60e21ec-Low-CPU-Utilization.
You may want to check the alarm configuration to ensure that it is compatible with your instance configuration. You can also attempt to execute the action manually.
These are some possible reasons for this failure and steps you can try to resolve it:
Incompatible action selected:
Your instance’s configuration may not be compatible with the selected action.
To execute the 'Terminate' action, your instance may have Termination Protection enabled. Disable this feature if you want to terminate your instance. Once you do that, the alarm will execute the action after the next applicable alarm state change.
To execute the 'Stop' action, your instance’s root device type must be an EBS volume. If the root device type is the instance store, select the 'Terminate' action instead. Once you do that, the alarm will execute the action after the next applicable alarm state change.
Temporary service interruption: There may have been an issue with Amazon CloudWatch or Amazon EC2. We have retried the action without success. You can try to execute the action manually, or wait for the next applicable alarm state change.
Sincerely, Amazon Web Services
Stop seems to be an option for the free micro instance but not for these other instances. When I try to change the shutdown behavior to Stop in the Actions menu, it says:
An error occurred while changing the shutdown behavior of this instance.
Modifying 'instanceInitiatedShutdownBehavior' is not supported for spot instances.
Is there another way to get around this problem or will we have to wait until Amazon makes this feature available?
Use standard instances instead of spot instances. Spot instances allow you to bid on extra capacity within ec2. However, they may automatically shut down if the spot price exceeds your bid.
It's not really intended for an always-on instance.
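The two compatibility rules listed in the notification email quoted in the question can be expressed as a small check to run before choosing an alarm action. This is only a sketch of those two rules: the function name is hypothetical, the root_device_type values ("ebs" or "instance-store") are assumed to match what EC2's describe-instances reports, and the spot-instance restriction from the error message is not modeled.

```python
def compatible_actions(root_device_type, termination_protected):
    """Return which of 'stop' and 'terminate' the email's rules allow."""
    actions = []
    # 'Stop' requires an EBS root device.
    if root_device_type == "ebs":
        actions.append("stop")
    # 'Terminate' is blocked while Termination Protection is enabled.
    if not termination_protected:
        actions.append("terminate")
    return actions


if __name__ == "__main__":
    print(compatible_actions("instance-store", termination_protected=False))
    print(compatible_actions("ebs", termination_protected=True))
```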