I'm researching the AWS CloudWatch SDK for Java and I see there's a limit of 5,000 alarms per account per region for PutMetricAlarm:
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/AlarmThatSendsEmail.html
In my situation the number of alarms could potentially surpass this limit (e.g. a transaction fails for a particular product). I wouldn't need to configure thresholds for a predetermined set of alarms. Rather, the alarm would be fired ad hoc, programmatically, at the time a failure is detected, and the number of distinct failure possibilities could reach well over 5,000.
Does CloudWatch support this scenario, either through PutMetricAlarm or otherwise?
You could:
Increase your account limits (contact AWS Support)
Use the SetAlarmState API to force an existing alarm into a given state programmatically
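A minimal sketch of the second option, assuming Python with boto3 and one pre-created alarm that all failure types share. The function name and the shape of the details payload are hypothetical; `set_alarm_state` is the boto3 call for SetAlarmState. The failure details travel in the state reason, so a single alarm can stand in for many failure types.

```python
import json


def fire_adhoc_alarm(cloudwatch, alarm_name, product_id, error):
    """Force an existing alarm into ALARM and attach the failure details.

    `cloudwatch` is expected to be a boto3 CloudWatch client, for example
    boto3.client("cloudwatch"). The alarm itself must already exist.
    """
    details = {"productId": product_id, "error": error}  # hypothetical payload
    cloudwatch.set_alarm_state(
        AlarmName=alarm_name,
        StateValue="ALARM",
        # StateReason is limited to 1023 characters, so truncate defensively.
        StateReason=f"Transaction failed for product {product_id}"[:1023],
        StateReasonData=json.dumps(details),  # must be a JSON string
    )
    return details
```

The forced state is temporary: the alarm goes back to its metric-driven state at its next evaluation. Since actions run on state changes, calling this while the alarm is already in ALARM will most likely not notify again.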
I have some ECS tasks running in AWS Fargate which in very rare cases may "die" internally, but will still show as "RUNNING" and not fail and trigger the task to restart.
What I would like to do, if possible, is check for the absence of logs: e.g. if no logs have been written in 30 minutes, trigger a Lambda to kill the ECS task, which will cause it to start back up.
The health check functionality isn't sufficient.
If this isn't possible, are there any other approaches I could consider?
You can use a metric with anomaly detection, but turning logs into a metric may cost money, and the alarm may too. I would rather run a Lambda every 30 minutes that checks whether logs are present and kills the ECS task as needed. You can run a Lambda on an interval with CloudWatch Events (now EventBridge).
Logs are probably sent from ECS to a CloudWatch Logs log group. If the log group has a static name, you can use the SDK to describe the streams inside the group. This API call tells you the timestamp of the last event in each stream.
In Lambda Node.js runtimes up to 16.x, aws-sdk v2 is already present, so you can require it without installing it (newer runtimes ship v3 instead). Here is the doc for v2:
https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/CloudWatchLogs.html#describeLogStreams-property
Set orderBy: "LastEventTime" together with descending: true so the most recently active stream comes first, and to save network time lower limit from the default 50 to limit: 1. The result will then contain that stream's lastEventTimestamp.
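The check described above can be sketched as follows. This is Python with boto3 rather than the Node.js SDK, but the DescribeLogStreams parameters are the same. The function names are hypothetical, and it assumes the task belongs to an ECS service, so that stopping it makes the service start a replacement.

```python
import time

STALE_AFTER_MS = 30 * 60 * 1000  # 30 minutes, taken from the question


def last_event_ms(logs, log_group):
    """Return the newest lastEventTimestamp in the group, or None if there is none."""
    resp = logs.describe_log_streams(
        logGroupName=log_group,
        orderBy="LastEventTime",
        descending=True,  # most recently active stream first
        limit=1,
    )
    streams = resp.get("logStreams", [])
    if not streams:
        return None
    return streams[0].get("lastEventTimestamp")


def is_stale(last_ms, now_ms, max_age_ms=STALE_AFTER_MS):
    """True when no event exists or the newest one is older than max_age_ms."""
    return last_ms is None or now_ms - last_ms > max_age_ms


def restart_if_silent(logs, ecs, log_group, cluster, task_arn, now_ms=None):
    """Stop the task when its logs went quiet; returns True if it was stopped."""
    now_ms = int(time.time() * 1000) if now_ms is None else now_ms
    if is_stale(last_event_ms(logs, log_group), now_ms):
        ecs.stop_task(cluster=cluster, task=task_arn, reason="no logs for 30 minutes")
        return True
    return False
```

Here `logs` and `ecs` would be `boto3.client("logs")` and `boto3.client("ecs")`. One caveat: AWS documents lastEventTimestamp as eventually consistent, so it can lag behind ingestion, and the 30-minute cutoff should be treated as approximate.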
anomaly detection:
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch_Anomaly_Detection.html
alarms:
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/AlarmThatSendsEmail.html
Check pricing for these. There is a free tier, so it may not cost you anything, yet it is easy to build up real spend with CloudWatch: https://aws.amazon.com/cloudwatch/pricing/
To run lambda on interval:
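One way to set that up, sketched with boto3 under assumed names (the rule name, target Id and function name are hypothetical): an EventBridge rule with a rate expression that targets the Lambda.

```python
def schedule_lambda(events, rule_name, lambda_arn, minutes=30):
    """Create or update an EventBridge rule that invokes the Lambda at a fixed rate.

    `events` is expected to be a boto3 EventBridge client, for example
    boto3.client("events").
    """
    # Rate expressions use the singular unit for 1 and the plural otherwise.
    unit = "minute" if minutes == 1 else "minutes"
    rule = events.put_rule(
        Name=rule_name,
        ScheduleExpression=f"rate({minutes} {unit})",
        State="ENABLED",
    )
    events.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "log-check-lambda", "Arn": lambda_arn}],  # hypothetical Id
    )
    return rule["RuleArn"]
```

The Lambda also needs a resource-based permission that lets events.amazonaws.com invoke it (the Lambda AddPermission API); that step is left out here.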
I want to create an alarm for a particular time window. The use case: if we see customer traffic drop between 6:00 AM and 10:00 PM, we should get an alarm so we can find out why customers are not using our service and take action. Is this scenario possible with a CloudWatch alarm? We already have a request-count metric in place.
A CloudWatch alarm cannot be restricted to a time range, but since you want to know whether something "unusual" is happening, I would recommend you look at Using CloudWatch Anomaly Detection - Amazon CloudWatch:
When you enable anomaly detection for a metric, CloudWatch applies statistical and machine learning algorithms. These algorithms continuously analyze metrics of systems and applications, determine normal baselines, and surface anomalies with minimal user intervention.
See: New – Amazon CloudWatch Anomaly Detection | AWS News Blog
It should be able to notice if a metric goes outside of its "normal" range, and trigger an alarm.
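A rough sketch of such an alarm, written as a helper that builds the arguments for boto3's `put_metric_alarm`. The helper name, the metric Ids, the 5-minute period and the three evaluation periods are assumptions for illustration; the band width of 2 is the number of standard deviations passed to ANOMALY_DETECTION_BAND.

```python
def traffic_drop_alarm(alarm_name, namespace, metric_name, sns_topic_arn, band_width=2):
    """Build put_metric_alarm arguments that fire when requests fall below the band."""
    return {
        "AlarmName": alarm_name,
        # Alarm only on drops: the metric is below the lower edge of the band.
        "ComparisonOperator": "LessThanLowerThreshold",
        "EvaluationPeriods": 3,  # assumption: three consecutive 5-minute periods
        "ThresholdMetricId": "band",
        "AlarmActions": [sns_topic_arn],
        "Metrics": [
            {
                "Id": "requests",
                "ReturnData": True,
                "MetricStat": {
                    "Metric": {"Namespace": namespace, "MetricName": metric_name},
                    "Period": 300,
                    "Stat": "Sum",
                },
            },
            {
                "Id": "band",
                "ReturnData": True,
                "Label": "expected range",
                "Expression": f"ANOMALY_DETECTION_BAND(requests, {band_width})",
            },
        ],
    }
```

It would be applied with `boto3.client("cloudwatch").put_metric_alarm(**traffic_drop_alarm(...))`. Because the model learns daily and weekly patterns, the quiet hours outside 6:00 AM to 10:00 PM should get a lower expected range without an explicit time window.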
I just started using AWS services. I want to receive notifications if any service usage exceeds its limit. After searching for options, I found that this can be achieved using an AWS CloudWatch alarm and AWS Limit Monitor deployed with AWS CloudFormation. My question is: will I be charged if I use these services to receive notifications?
Yes, you can set up all kinds of notifications to keep a handle on what you are being billed, but that doesn't stop you from actually getting billed if you exceed your limits.
For example, I have alerts to notify me when I reach 25%, 50%, 75% and 100% of my typical monthly spend, so I should get roughly one notification each week. But a lot can happen between when the notification is sent and when you take action, especially if, for example, someone got access to your account and started crypto-mining on some big EC2 instances.
We have a service running in AWS ECS that we want to scale in and out based on 2 metrics.
Scale out when: cpu > 80% or connection_count > 9500
Scale in when: cpu < 50% and connection_count < 5000
We have access to both the CPU and connection count metrics and alarms in CloudWatch. However, we can't figure out how to set up a dynamic scaling policy like this based on both of them.
Using the standard AWS console interface for creating the auto scaling rules, I don't see any option for multiple metrics. Any links to a tutorial or AWS docs on this would be appreciated.
Based on the responses posted in the AWS support forums, alarms cannot express AND/OR/IF conditions. (https://forums.aws.amazon.com/thread.jspa?threadID=94984)
The thread does mention, however, that a feature request has already been filed with the CloudWatch team.
The following is mentioned as a workaround:
"In the meantime, a possible workaround can be to create a custom metric using a custom script which would run after every five minutes and get the data points from the CloudWatch metrics, then perform the AND or OR operation and then push the output to a custom metric. You can then create a CloudWatch alarm which would monitor this custom metric and then trigger actions accordingly."
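A minimal sketch of that workaround, using the thresholds from the question. The function names, the `Custom/Scaling` namespace and the `ScalingSignal` metric name are made up for illustration, and reading the latest CPU and connection values from CloudWatch is left out. The script collapses both conditions into one value: 1 to scale out, -1 to scale in, 0 to hold.

```python
def scaling_signal(cpu_percent, connection_count):
    """Combine both metrics into one value using the question's thresholds."""
    # Scale out on OR: either metric being high is enough.
    if cpu_percent > 80 or connection_count > 9500:
        return 1
    # Scale in on AND: both metrics must be low.
    if cpu_percent < 50 and connection_count < 5000:
        return -1
    return 0


def publish_signal(cloudwatch, cpu_percent, connection_count,
                   namespace="Custom/Scaling", metric_name="ScalingSignal"):
    """Push the combined value as a custom metric and return it.

    `cloudwatch` is expected to be a boto3 CloudWatch client.
    """
    value = scaling_signal(cpu_percent, connection_count)
    cloudwatch.put_metric_data(
        Namespace=namespace,
        MetricData=[{"MetricName": metric_name, "Value": value, "Unit": "None"}],
    )
    return value
```

Two ordinary alarms on the custom metric (one for values of 1 or more, one for values of -1 or less) can then each drive a scaling policy.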
I have an instance in AWS whose CPU from time to time crosses the threshold of 90%.
I have created an alert for this. However, I received only one notification, during the first 5 minutes, even though the CPU was at 100% for 2 hours.
How do I set the metric so I will keep getting notifications all the time?
CloudWatch does not send notifications continuously while the threshold is breached. It sends a notification only when the alarm state changes.
Alarms invoke actions for sustained state changes only. CloudWatch alarms do not invoke actions simply because they are in a particular state, the state must have changed and been maintained for a specified number of periods.
Ref: AWS Cloudwatch Documentation
One possible solution that I can think of is to create multiple CloudWatch alarms with different thresholds.
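A sketch of the multiple-alarm idea, assuming the instance is an EC2 instance and using hypothetical thresholds of 90, 95 and 99 percent. The helper name and the alarm naming scheme are made up. Each returned dict is a set of arguments for boto3's `put_metric_alarm`.

```python
def staggered_cpu_alarms(instance_id, sns_topic_arn, thresholds=(90, 95, 99)):
    """Build one put_metric_alarm argument set per CPU threshold."""
    return [
        {
            "AlarmName": f"cpu-above-{t}-{instance_id}",  # hypothetical naming
            "Namespace": "AWS/EC2",
            "MetricName": "CPUUtilization",
            "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
            "Statistic": "Average",
            "Period": 300,  # 5 minutes, matching the question
            "EvaluationPeriods": 1,
            "Threshold": float(t),
            "ComparisonOperator": "GreaterThanThreshold",
            "AlarmActions": [sns_topic_arn],
        }
        for t in thresholds
    ]
```

Each alarm still notifies only once per transition, so this yields one notification per threshold crossed rather than a continuous stream.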
As the answer above already says, the alarm is not triggered again. One thing you can do is change the alarm threshold to a very large value and then back to the original value, so that the state change occurs again.