Stopping EC2 instances when a custom CloudWatch metric passes a limit - amazon-web-services

I'm trying to find a way to make an Amazon EC2 instance stop automatically when a certain custom metric on CloudWatch passes a limit. So far, my understanding is based on these articles:
Discussion Forum: Custom Metric EC2 Action
CloudWatch Documentation: Create Alarms to Stop, Terminate, Reboot, or Recover an Instance
This only works if the metric is:
Tied to a specific instance
Of type System/Linux
However, in my case I have a custom metric that is not instance-related but "global". If a certain limit is passed, I need to stop all instances, no matter which instance the triggering log entry comes from.
Does anybody know if there is a way to make this work? What I need is some way to make CloudWatch work like this:
If an arbitrary custom metric value passes a certain limit -> stop defined instances that are not tied to the metric itself.
The main problem is that the EC2 option is greyed out because the metric is not tied to a specific EC2 instance, and I'm not sure if there's any way to do this without tying the metric itself to a specific instance.

Create a CloudWatch alarm on the custom metric and have it post notifications to an SNS topic.
Have the SNS topic trigger a Lambda function that shuts down your EC2 instances via a call to the AWS API.
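The Lambda side of this could look roughly like the sketch below. It is not a tested deployment: the instance IDs are placeholders, the handler assumes the usual SNS envelope for CloudWatch alarm notifications (the alarm JSON in `Records[].Sns.Message` with a `NewStateValue` field), and the function's IAM role is assumed to allow `ec2:StopInstances`. The optional `ec2` argument is only there so the logic can be exercised without AWS.

```python
import json

# Hypothetical IDs of the instances to stop; replace with your own.
INSTANCE_IDS = ["i-0123456789abcdef0", "i-0fedcba9876543210"]


def alarm_states(event):
    """Return the NewStateValue of each CloudWatch alarm message in an SNS event."""
    states = []
    for record in event.get("Records", []):
        message = json.loads(record["Sns"]["Message"])
        states.append(message.get("NewStateValue"))
    return states


def lambda_handler(event, context, ec2=None):
    """Stop the configured instances when any delivered alarm is in ALARM state."""
    if "ALARM" not in alarm_states(event):
        return {"stopped": []}
    if ec2 is None:
        import boto3  # imported lazily so the parsing logic can run offline

        ec2 = boto3.client("ec2")
    ec2.stop_instances(InstanceIds=INSTANCE_IDS)
    return {"stopped": INSTANCE_IDS}
```

Because the instance list lives in the function rather than in the metric, the metric itself does not need an instance dimension.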

Related

Is this the correct way to solve this problem or have I gone wrong?

I am told to create a Simple auto-scaling policy and using CloudWatch to trigger an increase in resources based on an alarm. I have created a target tracking scaling policy within my ASG and set the target value to 50, and with the alarm I have created an SNS topic to send a notification to my email when the metric goes above the target value. But I'm not entirely sure that's exactly what was asked for.
Is that what was meant by creating a 'Simple auto-scaling policy'? Any confirmation would be helpful.
As mentioned in the above comments, you need to confirm with whoever made that request.
You said:
I am told to create a Simple auto-scaling policy and using CloudWatch to trigger an increase in resources based on an alarm,
Therefore, you should create a simple auto-scaling policy which will trigger an increase (scale out) in resources.
Note: when you create a target tracking policy, the alarms are automatically created and managed for you, and they are deleted when you delete the policy. There is usually one alarm for "scale out" (HIGH alarm) and one for "scale in" (LOW alarm). With a simple scaling policy on an EC2 Auto Scaling group, you create the CloudWatch alarm yourself and set the policy as its action.
So, I'd say you just need to set up autoscaling for whichever service you're working on (EC2, ECS, other; this is not mentioned in your question) and assign a policy to it (autoscale based on a target metric such as CPU, memory, or number of requests, or based on some other custom metric).
In the end, you'd want to test it by applying a load test and confirming that your service scales out when the load thresholds are breached (the threshold and how many datapoints must breach it in what period of time are all defined in your policy and the associated alarms automatically created after you set up the policy).
So, to answer the question, only the person who made the request can confirm, but I'd say with good certainty that the end goal is to increase the resources under load. So, no, the goal is not to send an email. You might want to send an email as well. But I would bet a benjamin that what's really wanted here is for you to make some service autoscale under load (scale out and scale in).
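For reference, a simple scaling policy triggered by a CloudWatch alarm on an EC2 Auto Scaling group could be set up with boto3 roughly as follows. This is a sketch under assumptions, not what the requester necessarily had in mind: the names, the threshold of 50, the 300-second period and cooldown, and the choice of `CPUUtilization` are all illustrative. The alarm is created explicitly and its action is the policy ARN returned by `put_scaling_policy`.

```python
def simple_scale_out_policy(asg_name):
    """Parameters for autoscaling.put_scaling_policy: add one instance per trigger."""
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"{asg_name}-scale-out",  # hypothetical naming scheme
        "PolicyType": "SimpleScaling",
        "AdjustmentType": "ChangeInCapacity",
        "ScalingAdjustment": 1,
        "Cooldown": 300,
    }


def high_cpu_alarm(asg_name, policy_arn, threshold=50.0):
    """Parameters for cloudwatch.put_metric_alarm whose action is the policy."""
    return {
        "AlarmName": f"{asg_name}-cpu-high",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "AutoScalingGroupName", "Value": asg_name}],
        "Statistic": "Average",
        "Period": 300,
        "EvaluationPeriods": 2,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [policy_arn],
    }


def create_scale_out(asg_name, autoscaling, cloudwatch):
    """Create the policy, then an alarm that invokes it; return the policy ARN."""
    response = autoscaling.put_scaling_policy(**simple_scale_out_policy(asg_name))
    policy_arn = response["PolicyARN"]
    cloudwatch.put_metric_alarm(**high_cpu_alarm(asg_name, policy_arn))
    return policy_arn
```

With real clients this would be called as `create_scale_out("my-asg", boto3.client("autoscaling"), boto3.client("cloudwatch"))`, where `my-asg` is a placeholder group name. An SNS topic ARN can be appended to `AlarmActions` if an email is wanted as well.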

Prevent CPUCreditBalance Alarm from firing on ec2 instance launch

I am using a CPUCreditBalance CloudWatch alarm to alert me whenever CPUCreditBalance goes below a threshold (e.g. < 50).
However, the problem is that when an instance has just launched, it starts with only its base credits (~20), or sometimes even 0.
Is there a way I can prevent the alarm from alerting when the instance has just launched?
Update: One possible solution that I can think of is writing a custom Lambda function that checks whether the instance has just started and suppresses the alarm in such cases.
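The check described in the update might be sketched like this. It only covers the decision itself, under assumptions: the 30-minute grace period is an arbitrary placeholder, and the surrounding wiring (for example, the alarm notifying a Lambda function that forwards the alert to a second SNS topic only when `should_notify` is true) is left out.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical grace period during which a low credit balance is expected.
GRACE_PERIOD = timedelta(minutes=30)


def should_notify(launch_time, now=None, grace=GRACE_PERIOD):
    """Return True only if the instance has been up longer than the grace period."""
    now = now or datetime.now(timezone.utc)
    return now - launch_time > grace


def launch_time_of(instance_id, ec2):
    """Look up an instance's LaunchTime with ec2.describe_instances."""
    response = ec2.describe_instances(InstanceIds=[instance_id])
    return response["Reservations"][0]["Instances"][0]["LaunchTime"]
```

Here `ec2` would be a `boto3.client("ec2")`, which returns `LaunchTime` as a timezone-aware `datetime`, so it can be compared directly with `datetime.now(timezone.utc)`.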

How to start a specific instance, when another instance is overloaded?

I have 2 instances connected to a load balancer. I would like to stop 1 instance and start it only when a certain alarm fires, for example when the first instance has a high CPU load.
I couldn't find how to do it. In the Auto Scaling group, I see I can launch a brand new instance, but that's not what I want; I want a specific instance to start.
I couldn't find how to connect an alert to an action: wake up this specific instance.
Should this be done in the load balancer configuration? I couldn't find how...
This is not really the way autoscaling is supposed to work, and hence the solution to your particular problem is a little bit more complex than simply using autoscaling to create new instances in response to metric thresholds being crossed. It may be worth asking yourself exactly why you need to do it this way and whether it could be achieved in the usual way.
In order to achieve starting (and stopping) a particular instance you'll need three pieces:
A CloudWatch Alarm triggered by the metric you need (CPUUtilization) crossing your desired threshold.
An SNS topic that is triggered by the alarm in the previous step.
A Lambda function (with the correct IAM permissions) that is subscribed to the SNS topic and sends the relevant API calls to EC2 to start or stop the instances when the notification from SNS arrives. You can find some examples of the code needed to do this, e.g. here in Node.js and here from AWS, although there are probably others if you prefer another language.
Once you put all these together, you should be able to respond to a change in CPU by starting and stopping particular instances.
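As an illustration of the third piece, a Python handler could map the alarm state to a start or stop call along these lines. It is a sketch under assumptions: the instance ID is a placeholder, the SNS topic is assumed to be configured as both the alarm action and the OK action (otherwise the "stop" branch never fires), and the IAM role is assumed to allow `ec2:StartInstances` and `ec2:StopInstances`.

```python
import json

# Hypothetical ID of the specific instance to wake up.
STANDBY_INSTANCE_ID = "i-0123456789abcdef0"


def action_for(state):
    """Map a CloudWatch alarm state to 'start', 'stop', or None (do nothing)."""
    return {"ALARM": "start", "OK": "stop"}.get(state)


def lambda_handler(event, context, ec2=None):
    """Start the standby instance on ALARM and stop it again on OK."""
    message = json.loads(event["Records"][0]["Sns"]["Message"])
    action = action_for(message.get("NewStateValue"))
    if action is None:  # e.g. INSUFFICIENT_DATA
        return None
    if ec2 is None:
        import boto3  # imported lazily so the mapping can be tested offline

        ec2 = boto3.client("ec2")
    if action == "start":
        ec2.start_instances(InstanceIds=[STANDBY_INSTANCE_ID])
    else:
        ec2.stop_instances(InstanceIds=[STANDBY_INSTANCE_ID])
    return action
```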

What is the mechanism to get load on a box to trigger an autoscaling group in AWS?

I've got my web servers set up to autoscale at certain times of the day.
I can measure the load on the box using scripts executed by Consul - and this can trigger events at certain thresholds.
I want to push these two together and trigger autoscaling at certain load levels. (Assume CPU load at 75% is the threshold).
My question is: What is the mechanism to get load on a box to trigger an autoscaling group in AWS?
Assumptions:
I was not planning to use AWS CloudWatch, but am interested if this is the solution.
I'm more interested in the autoscale triggering interface. Is it a queue or a REST endpoint?
As @mahdi said, you can easily use AWS CloudWatch to do this.
However, if you want Consul (or anything outside the scope of an AWS "service") to do it, you can use Lambda.
You would create a Lambda function that scales your Auto Scaling group out or in (or both). Lambda can have many triggers, such as an HTTP endpoint through API Gateway. If you already have Consul set up to react to load (it sounds like you do, since you said it can trigger events at certain thresholds), just make it issue the HTTP request to API Gateway to scale out or in.
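A minimal sketch of such a function, assuming an API Gateway proxy integration and a request body of the form `{"delta": 1}` (both the body format and the group name `web-asg` are made up for illustration). It reads the group's current desired capacity, applies the delta within the group's min/max limits, and calls `set_desired_capacity` only if the value changes.

```python
import json

ASG_NAME = "web-asg"  # hypothetical Auto Scaling group name


def new_capacity(current, delta, minimum, maximum):
    """Apply delta to the current desired capacity, clamped to the group's limits."""
    return max(minimum, min(maximum, current + delta))


def lambda_handler(event, context, autoscaling=None):
    """Handle an API Gateway proxy request whose JSON body looks like {"delta": 1}."""
    delta = int(json.loads(event.get("body") or "{}").get("delta", 0))
    if autoscaling is None:
        import boto3  # imported lazily so the logic can be tested offline

        autoscaling = boto3.client("autoscaling")
    group = autoscaling.describe_auto_scaling_groups(
        AutoScalingGroupNames=[ASG_NAME]
    )["AutoScalingGroups"][0]
    desired = new_capacity(
        group["DesiredCapacity"], delta, group["MinSize"], group["MaxSize"]
    )
    if desired != group["DesiredCapacity"]:
        autoscaling.set_desired_capacity(
            AutoScalingGroupName=ASG_NAME, DesiredCapacity=desired
        )
    return {"statusCode": 200, "body": json.dumps({"desired": desired})}
```

Consul's threshold handler would then POST `{"delta": 1}` to the API Gateway URL to scale out and `{"delta": -1}` to scale in.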
You can create a CloudWatch alarm on the CPUUtilization metric and set it to change state when CPU utilization exceeds 75%. Then, in the Auto Scaling group, you use this alarm in a scaling (in/out) policy. You can also control the number of instances in an Auto Scaling group manually (e.g. from your application running on one of the instances) by changing the Desired value. This documentation can be helpful.

AWS custom logging

Environment – Two Different ec2 instances running tomcat separately.
Requirement – If there is any Error in logs – we should get an alert.
Implementation –
We implemented AWS custom logging for this, which is successfully sending alerts on the Error pattern matching.
It automatically created a log group – “/opt/tomcat/logs/catalina.out”.
Under this log group there are two log streams, one for each instance.
Problem –
Now I want a separate alarm for each instance.
The problem is that when I create an alarm, it does not let me choose the instance. It takes both instances by default, which means one alarm monitors both instances simultaneously and sends alerts without mentioning the instance name, so it is difficult to find which instance actually sent the alert.
The second problem is that we created a few log metrics for testing (for example, on the keyword "info"), which we want to delete but are not able to.
It appears that you are using the CloudWatch Logs functionality that permits automated sending of log files from an EC2 instance (or elsewhere) to the CloudWatch service. CloudWatch Logs can then be configured to look for strings in the log files, which will trigger the recording of metrics.
To create separate alarms for separate instances, each EC2 instance should be configured to use a different CloudWatch log group, because metric filters are defined per log group rather than per log stream. The CloudWatch Logs agent takes a Destination Log Group name.
See: Quick Start: Install and Configure the CloudWatch Logs Agent on an Existing EC2 Instance
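Once each instance writes to its own log group, one metric filter and one alarm per instance could be created with boto3 along these lines. This is a sketch with made-up names: the instance names, log group names, the `Tomcat` namespace, the `ERROR` pattern, and the 5-minute period are all assumptions to adapt.

```python
def error_filter(log_group, instance_name):
    """Parameters for logs.put_metric_filter: count log events containing ERROR."""
    return {
        "logGroupName": log_group,
        "filterName": f"{instance_name}-errors",
        "filterPattern": "ERROR",
        "metricTransformations": [
            {
                "metricName": f"{instance_name}-ErrorCount",
                "metricNamespace": "Tomcat",  # hypothetical namespace
                "metricValue": "1",
            }
        ],
    }


def error_alarm(instance_name, topic_arn):
    """Parameters for cloudwatch.put_metric_alarm on that instance's metric."""
    return {
        "AlarmName": f"{instance_name}-tomcat-errors",  # names the instance in the alert
        "Namespace": "Tomcat",
        "MetricName": f"{instance_name}-ErrorCount",
        "Statistic": "Sum",
        "Period": 300,
        "EvaluationPeriods": 1,
        "Threshold": 1,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "notBreaching",
        "AlarmActions": [topic_arn],
    }


def configure(instances, topic_arn, logs, cloudwatch):
    """Create one filter and one alarm per {instance_name: log_group} entry."""
    for name, log_group in instances.items():
        logs.put_metric_filter(**error_filter(log_group, name))
        cloudwatch.put_metric_alarm(**error_alarm(name, topic_arn))
```

For example, `configure({"web-1": "/tomcat/web-1/catalina.out", "web-2": "/tomcat/web-2/catalina.out"}, topic_arn, boto3.client("logs"), boto3.client("cloudwatch"))`. Since the alarm name contains the instance name, the notification identifies which instance logged the error.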
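As for the metrics that you wish to delete, it is not possible to delete metrics from Amazon CloudWatch. You can delete the metric filters you created for testing, which stops new data points from being published. A metric that receives no new data points stops appearing in the console after about two weeks, and its stored data eventually expires under CloudWatch's retention schedule.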