I'm configuring an alarm for an instance from the AWS CLI, for example to trigger a notification when the CPU has been idle for 5 minutes, but I want to set this alarm for a lot of instances.
With this Bash script I created one alarm for one instance:
aws cloudwatch put-metric-alarm --alarm-name cpu-mon --alarm-description "Alarm when CPU exceeds 70 percent" --metric-name CPUUtilization --namespace AWS/EC2 --statistic Average --period 300 --threshold 70 --comparison-operator GreaterThanThreshold --dimensions "Name=InstanceId,Value=i-12345678" --evaluation-periods 2 --alarm-actions arn:aws:sns:us-east-1:111122223333:MyTopic --unit Percent
So I don't see how I can use this script for other instances, or loop over it, in order to cover more instances.
If you have a list of instance IDs you want to create alarms for, you could do something like:
#!/bin/bash
instances=(instanceId1 instanceId2 etc)
for i in "${instances[@]}"; do
aws cloudwatch put-metric-alarm \
--alarm-name cpu-mon-${i} \
--alarm-description "Alarm when CPU exceeds 70 percent" \
--metric-name CPUUtilization \
--namespace AWS/EC2 \
--statistic Average \
--period 300 \
--threshold 70 \
--comparison-operator GreaterThanThreshold \
--dimensions "Name=InstanceId,Value=${i}" \
--evaluation-periods 2 \
--alarm-actions arn:aws:sns:us-east-1:111122223333:MyTopic \
--unit Percent
done
You could also initially use the AWS CLI to grab instance IDs based on tags, instance names etc and then use those to create the alarms along the same lines.
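The tag-based variant could look something like the sketch below. The function name create_cpu_alarms_for_tag and the tag key/value passed to it are placeholders for illustration, not part of the original script; the alarm settings are copied from the loop above.

```shell
#!/bin/bash
# Sketch: create one CPU alarm per instance that carries a given tag.
# create_cpu_alarms_for_tag is a hypothetical name; tag key/value are examples.
create_cpu_alarms_for_tag() {
  local tag_key=$1 tag_value=$2 topic_arn=$3
  local ids id

  # Ask EC2 for the IDs of all matching instances (whitespace-separated text).
  ids=$(aws ec2 describe-instances \
    --filters "Name=tag:${tag_key},Values=${tag_value}" \
    --query 'Reservations[].Instances[].InstanceId' \
    --output text) || return 1

  # Word splitting on $ids is intentional: one iteration per instance ID.
  for id in $ids; do
    aws cloudwatch put-metric-alarm \
      --alarm-name "cpu-mon-${id}" \
      --alarm-description "Alarm when CPU exceeds 70 percent" \
      --metric-name CPUUtilization \
      --namespace AWS/EC2 \
      --statistic Average \
      --period 300 \
      --threshold 70 \
      --comparison-operator GreaterThanThreshold \
      --dimensions "Name=InstanceId,Value=${id}" \
      --evaluation-periods 2 \
      --alarm-actions "$topic_arn" \
      --unit Percent || return 1
  done
}
```

It would be called as, for example: create_cpu_alarms_for_tag app myapp arn:aws:sns:us-east-1:111122223333:MyTopic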
Related
I am trying to write my own tools for AWS monitoring, calling the AWS API from the console and later building some graphs.
aws cloudwatch get-metric-statistics --metric-name MemoryUtilization --start-time 2022-11-19 --end-time 2022-11-21 --period 3600 --dimensions Name=ClusterName,Value=<mycluster> --dimensions Name=ServiceName,Value=<service-name> --namespace ECS/ContainerInsights --statistics Average
This always reports 0 datapoints, whereas I can see in the console that metrics are being collected.
What is wrong with the command?
The issue was that multiple dimensions have to be given in a single --dimensions argument, separated by a space, instead of adding another --dimensions.
So this worked:
echo running $metric on $ServiceName
aws cloudwatch get-metric-statistics --metric-name $metric \
--start-time $start_time --end-time $end_time \
--period $period \
--dimensions Name=ClusterName,Value=<my-cluster> Name=ServiceName,Value=$ServiceName \
--namespace $namespace \
--statistics $statistics --output text > $ServiceName-$metric-$now.csv
I have multiple instances. I need to retrieve some metrics from a subset of them (not all) using the AWS CLI. I have tried specifying the InstanceId multiple times as a dimension, but it only considers one of the values. For example, the command below only returns the metric values for instance i-xxxxxx (it ignores i-yyyyyyy):
aws cloudwatch get-metric-statistics --namespace AWS/EC2 \
--metric-name CPUUtilization \
--statistics Maximum \
--dimensions Name=InstanceId,Value=i-yyyyyyyyyy Name=InstanceId,Value=i-xxxxxxxxx \
--start-time 2018-08-01T00:00:00Z --end-time 2018-08-01T10:00:00Z --period 300
One additional comment: the subset can be obtained by filtering the list of instances using a tag:
aws ec2 describe-instances --filters Name=tag:app,Values=myapp \
--query 'Reservations[*].Instances[*].InstanceId' --output text
You're using the wrong API: get-metric-statistics is intended to return the time-series data for a single metric, which is identified by all of its dimensions. I suspect that the CLI interprets this field as an associative array, so the second instance ID overwrote the first.
The simplest solution (assuming you're using bash on Linux) is to use a for loop to retrieve each set of metrics:
for instance in i-yyyyyyyyyy i-xxxxxxxxx
do aws cloudwatch get-metric-statistics \
--dimensions "Name=InstanceId,Value=${instance}" \
...
done
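If a single call is preferred over a loop, get-metric-data accepts several metric queries at once. The sketch below only builds the --metric-data-queries JSON for a list of instance IDs; the helper name build_cpu_queries and the m0, m1, ... query IDs are invented for illustration, and the namespace, metric, period and statistic are taken from the question. Treat it as a starting point and check it against the CLI reference.

```shell
#!/bin/bash
# Sketch: emit a JSON array with one CPUUtilization query per instance ID,
# intended for `aws cloudwatch get-metric-data --metric-data-queries`.
# build_cpu_queries is a hypothetical helper name.
build_cpu_queries() {
  local n=0 sep="" id
  printf '['
  for id in "$@"; do
    # Each query needs a unique Id that starts with a lowercase letter.
    printf '%s{"Id":"m%d","MetricStat":{"Metric":{"Namespace":"AWS/EC2","MetricName":"CPUUtilization","Dimensions":[{"Name":"InstanceId","Value":"%s"}]},"Period":300,"Stat":"Maximum"}}' \
      "$sep" "$n" "$id"
    sep=","
    n=$((n + 1))
  done
  printf ']\n'
}

# Usage (not executed here):
#   aws cloudwatch get-metric-data \
#     --metric-data-queries "$(build_cpu_queries i-yyyyyyyyyy i-xxxxxxxxx)" \
#     --start-time 2018-08-01T00:00:00Z --end-time 2018-08-01T10:00:00Z
```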
I am publishing a custom metric into CloudWatch with two dimensions:
aws cloudwatch --region ap-southeast-1 put-metric-data --namespace CustomNS --metric-name ApiReqCount --dimensions ApiName=TestAPI,ApplicationName=App1 --timestamp 2017-03-07T05:00:00.000Z --statistic-values Sum=7,Minimum=1,Maximum=7,SampleCount=1
If I query with all of the dimensions, it gives the datapoints:
aws cloudwatch --region ap-southeast-1 get-metric-statistics --namespace CustomNS --metric-name ApiReqCount --dimensions Name=ApiName,Value=TestAPI Name=ApplicationName,Value=App1 --statistics Sum --start-time 2017-03-05T00:00:00Z --end-time 2017-03-08T12:00:00Z --period 300
However, when I query with no dimensions or with only some of the dimensions, I do not get any datapoints back:
aws cloudwatch --region ap-southeast-1 get-metric-statistics --namespace CustomNS --metric-name ApiReqCount --dimensions Name=ApiName,Value=TestAPI --statistics Sum --start-time 2017-03-05T00:00:00Z --end-time 2017-03-08T12:00:00Z --period 300
What I really want is to get all datapoints back when no dimensions are specified, and to optionally filter those datapoints by dimension.
You can't query with a partial set of dimensions or mix dimensions, even if the metric name is identical.
From the documentation:
CloudWatch treats each unique combination of dimensions as a separate metric, even if the metrics have the same metric name. You can't retrieve statistics using combinations of dimensions that you did not specifically publish.
See examples on: Amazon CloudWatch Concepts - Dimensions
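One common workaround is to publish the same value once per dimension combination you intend to query. A rough sketch follows; the function name publish_api_req_count is made up, and the choice of three combinations (both dimensions, ApiName only, none) is an assumption about which queries are needed. Each combination becomes its own metric in CloudWatch.

```shell
#!/bin/bash
# Sketch: publish one data point under every dimension combination that
# will later be queried. publish_api_req_count is a hypothetical name.
publish_api_req_count() {
  local api=$1 app=$2 value=$3
  # Common part of the command; namespace, metric and region are from the question.
  local base=(aws cloudwatch --region ap-southeast-1 put-metric-data
    --namespace CustomNS --metric-name ApiReqCount --value "$value")

  # Full combination: queried with both ApiName and ApplicationName.
  "${base[@]}" --dimensions "ApiName=${api},ApplicationName=${app}" || return 1
  # Partial combination: queried with ApiName alone.
  "${base[@]}" --dimensions "ApiName=${api}" || return 1
  # No dimensions: queried with the namespace and metric name only.
  "${base[@]}" || return 1
}
```

For example, publish_api_req_count TestAPI App1 7 would issue three put-metric-data calls, after which the query with only Name=ApiName,Value=TestAPI should have a metric to match.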
I have set up some CloudWatch alarms using the CLI's put-metric-alarm command and everything works fine. The only issue is that it keeps sending notifications (every 5 minutes), which can really clog up your inbox. Is there a way to prevent it from sending further notifications once an alarm notification has already been sent for the day, unless a state change has occurred?
Here is the command I am using to create the alarm:
aws cloudwatch put-metric-alarm --region xx-region-x \
--alarm-name 'my CPU Check' \
--alarm-description 'my - CPU usage' \
--namespace AWS/EC2 \
--dimensions Name=InstanceId,Value=i-xxxxxxxx \
--metric-name CPUUtilization \
--statistic Average \
--comparison-operator GreaterThanThreshold \
--unit Percent \
--period 60 \
--threshold 0 \
--evaluation-periods 3 \
--alarm-actions arn:aws:sns:xx-region-x:xxxxxxxxxxxx:my-topic
I have put together these commands to autoscale EC2 instances based on SQS queue size. I have run all of the commands and my queue is at 10 messages, but not a single instance has been launched.
I am trying to figure out which SQS queue my CloudWatch alarms are listening to. Any help identifying the issue is also appreciated!
### Create Autoscaling Policy ###
aws autoscaling create-launch-configuration --launch-configuration-name my-lc --image-id ami-551c6d30 --instance-type m1.small
aws autoscaling create-auto-scaling-group --auto-scaling-group-name my-asg --launch-configuration-name my-lc --availability-zones "us-east-1a" "us-east-1c" --max-size 10 --min-size 0 --desired-capacity 0
# Scale up policy
aws autoscaling put-scaling-policy --policy-name my-sqs-scaleout-policy --auto-scaling-group-name my-asg --scaling-adjustment 1 --adjustment-type ChangeInCapacity
# Scale down policy
aws autoscaling put-scaling-policy --policy-name my-sqs-scalein-policy --auto-scaling-group-name my-asg --scaling-adjustment -1 --adjustment-type ChangeInCapacity
# Alarm to scale up
aws cloudwatch put-metric-alarm --alarm-name AddCapacityToProcessQueue --metric-name ApproximateNumberOfMessagesVisible --namespace "AWS/SQS" --statistic Average --period 120 --threshold 3 --comparison-operator GreaterThanOrEqualToThreshold --dimensions Name=QueueName,Value=my-queue --evaluation-periods 2 --alarm-actions arn:aws:autoscaling:us-east-1:850082592395:scalingPolicy:6408b62d-9363-4252-a88c-5ffab08a8cb5:autoScalingGroupName/my-asg:policyName/my-sqs-scaleout-policy
# Alarm to scale down
aws cloudwatch put-metric-alarm --alarm-name RemoveCapacityFromProcessQueue --metric-name ApproximateNumberOfMessagesVisible --namespace "AWS/SQS" --statistic Average --period 300 --threshold 1 --comparison-operator LessThanOrEqualToThreshold --dimensions Name=QueueName,Value=my-queue --evaluation-periods 2 --alarm-actions arn:aws:autoscaling:us-east-1:850082592395:scalingPolicy:4771ea64-2ebf-45ef-9328-50e058dc68b7:autoScalingGroupName/my-asg:policyName/my-sqs-scalein-policy
# Verify cloudwatch alarms
aws cloudwatch describe-alarms --alarm-names AddCapacityToProcessQueue RemoveCapacityFromProcessQueue
# Verify scaling policy
aws autoscaling describe-policies --auto-scaling-group-name my-asg
# Verify instances autoscaled
aws autoscaling describe-auto-scaling-groups --auto-scaling-group-name my-asg
The AWS Documentation states that:
Amazon Simple Queue Service sends data to CloudWatch every 5 minutes.
Additionally, you have specified an Average statistic over several evaluation periods. Therefore, it will take several 5-minute intervals of Amazon SQS sending metrics to Amazon CloudWatch before the alarm can evaluate.
It's possible that the metric period (120 seconds) is too short to receive multiple updates from SQS, which would leave the alarm in the INSUFFICIENT_DATA state.
Start by trying to get the Alarm to trigger with a Maximum setting and play with the time periods. Once the Alarm is successfully triggering, play with the thresholds to get the desired behaviour.
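As a starting point for that experiment, the scale-out alarm could be recreated along these lines. This is only a sketch: the function names are hypothetical, the 300-second period is chosen to match the 5-minute SQS reporting interval, and the single evaluation period is an arbitrary first value to tune. The second function prints each alarm's current state together with the first dimension value, which for these alarms is the queue name being watched.

```shell
#!/bin/bash
# Sketch: scale-out alarm using Maximum over one 300-second period.
# put_scaleout_alarm and show_alarm_state are hypothetical names;
# the scaling policy ARN is supplied by the caller.
put_scaleout_alarm() {
  local queue=$1 policy_arn=$2
  aws cloudwatch put-metric-alarm \
    --alarm-name AddCapacityToProcessQueue \
    --metric-name ApproximateNumberOfMessagesVisible \
    --namespace AWS/SQS \
    --statistic Maximum \
    --period 300 \
    --threshold 3 \
    --comparison-operator GreaterThanOrEqualToThreshold \
    --dimensions "Name=QueueName,Value=${queue}" \
    --evaluation-periods 1 \
    --alarm-actions "$policy_arn"
}

# Print name, state, state reason and first dimension value for each alarm.
show_alarm_state() {
  aws cloudwatch describe-alarms --alarm-names "$@" \
    --query 'MetricAlarms[].[AlarmName,StateValue,StateReason,Dimensions[0].Value]' \
    --output text
}
```

After put_scaleout_alarm my-queue followed by the scale-out policy ARN, running show_alarm_state AddCapacityToProcessQueue RemoveCapacityFromProcessQueue a few minutes later should show whether the alarm has left INSUFFICIENT_DATA.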