AWS CloudWatch alarm for Lambda Invocation - amazon-web-services

I want to monitor a Lambda function based on the Invocations metric. Specifically, I want the alarm to trigger if the function hasn't been invoked at least once in a period. I've set up the following alarm, and it hasn't triggered in 24 hours. I've confirmed that the Lambda function was not invoked during that time, yet the alarm never goes into the "ALARM" state. Any advice would be helpful.
Threshold: Invocations < 1 for 1 datapoints within 5 minutes
Statistic: Sum
Period: 5 minutes
Metric Name: Invocations
Namespace: AWS/Lambda
Datapoints to alarm: 1 out of 1

The Invocations metric is only written if there are invocations (i.e. it won't write a value of 0).
Therefore, for use cases such as this, where you want the alarm to trigger when the metric has not been written, you need to configure it to treat missing data as breaching.
Documentation for this is here.
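As a rough illustration of the answer above, here is a minimal Python sketch that builds the arguments for CloudWatch's PutMetricAlarm call with missing data treated as breaching. The function name, the alarm naming scheme, and the use of boto3 are assumptions made for this example; the metric, statistic, period, and threshold mirror the alarm described in the question.

```python
def build_no_invocation_alarm(function_name: str, period_seconds: int = 300) -> dict:
    """Return keyword arguments for PutMetricAlarm (hypothetical helper)."""
    return {
        "AlarmName": f"{function_name}-no-invocations",  # assumed naming scheme
        "Namespace": "AWS/Lambda",
        "MetricName": "Invocations",
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "Statistic": "Sum",
        "Period": period_seconds,
        "EvaluationPeriods": 1,
        "DatapointsToAlarm": 1,
        "Threshold": 1.0,
        "ComparisonOperator": "LessThanThreshold",
        # The key setting: a period with no datapoint counts as breaching,
        # because Lambda never publishes a value of 0 for Invocations.
        "TreatMissingData": "breaching",
    }


def create_alarm(params: dict) -> None:
    """Send the alarm definition to CloudWatch (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so building the parameters needs no SDK

    boto3.client("cloudwatch").put_metric_alarm(**params)
```

Calling create_alarm(build_no_invocation_alarm("my-function")) would create the alarm; "my-function" is a placeholder name.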

Related

AWS CloudWatch Composite Alarms: Send Alert When 1 Alarm Has been "In Alarm" For Over a Certain Amount Of Time

I'm trying to set up composite alarms on AWS, and so far I have most of it working. At the moment, I have a composite alarm made up of 3 alarms. If any 2 of these 3 alarms trigger, then the composite alarm also triggers. This part works fine.
However, I am having trouble with part of my use case. I'd also like to make it so that if one of these alarms within the composite alarm stays in alarm for over a certain period of time, then an alert is also sent out.
Here's an example of the situation:
2 out of the 3 alarms turn on in any time period: Alert should be sent
1 out of the 3 alarms turn on for under a certain time period: Alert should not be sent
1 out of the 3 alarms turn on for over a certain time period: Alert should be sent
I've tried looking into the settings available on the alarms themselves, and there doesn't seem to be an option for what I'm trying to do.
I'm wondering if this would require a lambda function? Is it possible for a lambda function to keep track of how long an alarm has been in alarm?
As discussed in the comments above, here is a possible solution to your problem. The only limitation is that the alarms can't use different time frames; both must use the same one.
For example, you would have Alarm 1 (CPU), which fires if CPU usage stays above 60% for 15 minutes, and Alarm 2 (EFS connections), which fires if there are more than 10 connections for 15 minutes.
The composite alarm then goes off when both conditions are true, and also when only Alarm 1 goes off.
This is how you are going to make this alarm.
As for testing, it depends on the type of alarms you are creating. For example, methods for driving up CPU and RAM usage are widely available on Stack Overflow.
You can also change the state of an alarm with the AWS CLI. The forced state usually lasts only a very short time, maybe 10 seconds.
aws cloudwatch set-alarm-state --alarm-name "myalarm" --state-value ALARM --state-reason "testing purposes"
You need to find a method that suits your needs better.
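One possible way to express the question's three cases as a single composite alarm rule is sketched below in Python. It is not the answerer's exact setup: it assumes that, next to each regular alarm, you create a second "sustained" alarm on the same metric with a longer evaluation window. All alarm names and both helper functions are hypothetical. The rule string uses the composite alarm rule syntax (ALARM("name") combined with AND and OR).

```python
from itertools import combinations


def any_k_of(alarm_names, k):
    """Rule fragment that is true when at least k of the named alarms are in ALARM."""
    groups = [
        "(" + " AND ".join(f'ALARM("{name}")' for name in combo) + ")"
        for combo in combinations(alarm_names, k)
    ]
    return " OR ".join(groups)


def build_rule(regular_alarms, sustained_alarms):
    """Fire when any 2 regular alarms are in ALARM, or any sustained alarm is."""
    two_of_n = any_k_of(regular_alarms, 2)
    any_sustained = " OR ".join(f'ALARM("{name}")' for name in sustained_alarms)
    return f"{two_of_n} OR {any_sustained}"


# Hypothetical alarm names, for illustration only.
rule = build_rule(["cpu", "efs", "mem"], ["cpu-sustained"])
print(rule)
```

The resulting string could then be passed as the AlarmRule argument of put_composite_alarm in boto3, or pasted into the composite alarm's rule editor in the console.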

Set an alarm on a Lambda function based on execution time

Is it possible to set an alarm when a Lambda function takes more than a specific time to execute?
For example, I want to set an alarm if my Lambda function takes more than 10 seconds to execute.
In the AWS web console:
CloudWatch -> Alarm -> Create Alarm
Select metric:
Use the standard AWS Lambda metric Duration:
Set the alarm condition.
Then set the notification and other options, depending on your needs.
Set the alarm name and description.
As a result, you will get the alarm you need. (The screenshot shows a lower threshold of 2000 ms.)
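The console steps above can also be written down as an alarm definition. The Python sketch below builds the arguments for PutMetricAlarm for the 10-second case from the question. The helper name, the alarm name, and the choices of the Maximum statistic, a 60-second period, and notBreaching for missing data are assumptions for illustration. The Duration metric is reported in milliseconds, so 10 seconds becomes 10000.

```python
def build_duration_alarm(function_name: str, max_duration_ms: float = 10_000.0) -> dict:
    """Return keyword arguments for PutMetricAlarm (hypothetical helper)."""
    return {
        "AlarmName": f"{function_name}-slow-execution",  # assumed naming scheme
        "Namespace": "AWS/Lambda",
        "MetricName": "Duration",
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        # Maximum catches a single slow invocation; Average would smooth it out.
        "Statistic": "Maximum",
        "Period": 60,
        "EvaluationPeriods": 1,
        "Threshold": max_duration_ms,  # Duration is measured in milliseconds
        "ComparisonOperator": "GreaterThanThreshold",
        # Assumed choice: periods without invocations should not raise the alarm.
        "TreatMissingData": "notBreaching",
    }


params = build_duration_alarm("my-function")  # placeholder function name
```

The dictionary can be passed to boto3 as client("cloudwatch").put_metric_alarm(**params); add AlarmActions with an SNS topic ARN if you want a notification.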

Set AWS Cloudwatch Alarm datapoint timespan and action to shut it down

The following case:
We want an alarm in AWS that reads the EstimatedCharges metric for AmazonCloudWatch every 5 minutes (to catch a potential log overflow). But the only period I can set is 6 hours; otherwise the alarm status is "Insufficient". How can I change the metric so that I can check it every 5 minutes?
And how can I create an action that stops CloudWatch Logs when the charges exceed X?
According to the documentation, CloudWatch billing metrics are updated every 6 hours.
Thus, your alarm can change its status only every 6 hours.
Just set Treat Missing Data to notBreaching:
notBreaching: Missing data points are treated as "good" and within the threshold.
More info: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/AlarmThatSendsEmail.html
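A minimal Python sketch of the resulting alarm definition follows, assuming a 6-hour period to match the update interval stated above and notBreaching for missing data. The helper name, alarm name, threshold value, and the choice of the Maximum statistic are hypothetical. The ServiceName and Currency dimensions are assumed to match how the EstimatedCharges metric appears in your account.

```python
SIX_HOURS_SECONDS = 6 * 60 * 60  # 21600, the update interval stated above


def build_billing_alarm(threshold_usd: float) -> dict:
    """Return keyword arguments for PutMetricAlarm (hypothetical helper)."""
    return {
        "AlarmName": "cloudwatch-estimated-charges",  # assumed name
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": [
            {"Name": "ServiceName", "Value": "AmazonCloudWatch"},
            {"Name": "Currency", "Value": "USD"},
        ],
        "Statistic": "Maximum",
        "Period": SIX_HOURS_SECONDS,
        "EvaluationPeriods": 1,
        "Threshold": threshold_usd,
        "ComparisonOperator": "GreaterThanThreshold",
        # Gaps between billing updates are treated as within the threshold.
        "TreatMissingData": "notBreaching",
    }


params = build_billing_alarm(50.0)  # 50 USD is an arbitrary example value
```

Billing metric data is stored in the US East (N. Virginia) region, so the CloudWatch client that receives these parameters would need to target us-east-1.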

AWS Cloudwatch Alarm status

I have set up a CloudWatch alarm that triggers an SNS email whenever certain keywords are found in CloudWatch Logs (using a metric filter).
When those keywords are detected, the alarm state changes from INSUFFICIENT_DATA to ALARM and the SNS topic is triggered.
However, the time it takes to move from ALARM back to INSUFFICIENT_DATA seems random.
Is there a specific way this works? I expected the alarm to return to INSUFFICIENT_DATA immediately after the ALARM state.
Any help would be appreciated. Thanks
Suppose the alarm has a metric period of 60 seconds and 3 evaluation periods, so the evaluation window is 3 * 60 seconds = 3 minutes.
The alarm will be in the ALARM state if all of the last 3 datapoints, at 60-second intervals, are breaching (above the threshold).
If any one of the last 3 datapoints is below the threshold, the alarm transitions to OK.
But if all of the latest 3 datapoints are missing (say your metric filter did not match, so no metric was published), the alarm waits longer than 3 periods before transitioning to INSUFFICIENT_DATA. This is by design, to accommodate network or processing delays.
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/AlarmThatSendsEmail.html
I came across the same situation, using a period of 1 minute and a condition of x > threshold.
The state changes to ALARM immediately whenever the metric exceeds the threshold, but changing back to OK or INSUFFICIENT_DATA takes 6 minutes. This happens only for missing data.
According to AWS Support, this is the expected behavior of CloudWatch alarms; a clear explanation can be found here: https://forums.aws.amazon.com/thread.jspa?threadID=284182

AWS CloudWatch alarm for SQS Number of Messages Visible

I am trying to capture the event of a new message arriving in my FIFO queue (as I want to avoid polling the queue indefinitely).
For this purpose I am evaluating a CloudWatch alarm on the ApproximateNumberOfMessagesVisible metric.
Here is my alarm description:
Threshold (the condition in which the alarm will go to the ALARM state): ApproximateNumberOfMessagesVisible >= 0 for 1 minute
Actions (the actions that will occur when the alarm changes state):
In ALARM:
Send message to topic "topic_for_events_generated_bycloudwatch" (xyz#xyz)
Send message to topic "topic_for_events_generated_bycloudwatch"
Period (the granularity of the datapoints for the monitored metric): 1 minute
My questions are:
Assuming there are more than 0 messages in the queue, will this alarm be raised only once when the condition is met, or every minute?
During a quick test I saw the alarm moving between the INSUFFICIENT_DATA and ALARM states in random order without any configuration changes. What could be the reason?
Screenshot of ApproximateNumberOfMessagesVisible metric graph
Screenshot of the log activity
Thanks in advance.
Regards,
Rohan K
CloudWatch invokes the alarm's actions once, on the state transition that occurs when the threshold is breached.
From the docs:
Alarms invoke actions for sustained state changes only. CloudWatch alarms do not invoke actions simply because they are in a particular state, the state must have changed and been maintained for a specified number of periods.
But
After an alarm invokes an action due to a change in state, its
subsequent behavior depends on the type of action that you have
associated with the alarm. For Amazon EC2 and Auto Scaling actions,
the alarm continues to invoke the action for every period that the
alarm remains in the new state. For Amazon SNS notifications, no additional actions are invoked.
An Example:
In the following figure, the alarm threshold is set to 3 units and the
alarm is evaluated over 3 periods. That is, the alarm goes to ALARM
state if the oldest of the 3 periods being evaluated is breaching, and
the 2 subsequent periods are either breaching or missing. In the
figure, this happens with the third through fifth time periods, and
the alarm's state is set to ALARM. At period six, the value dips below
the threshold, and the state reverts to OK. Later, during the ninth
time period, the threshold is breached again, but for only one period.
Consequently, the alarm state remains OK.
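The quoted example can be replayed with a small Python simulation. This is a deliberately simplified model, not CloudWatch's actual evaluation logic: it assumes no missing datapoints and puts the alarm in ALARM only when all of the last 3 datapoints exceed the threshold. The datapoint values are invented to match the description, since the figure itself is not available here.

```python
THRESHOLD = 3
EVALUATION_PERIODS = 3

# Hypothetical datapoints for periods 1..10: periods 3-5 breach the threshold,
# period 6 dips below it, and period 9 breaches again for one period only.
datapoints = [1, 2, 4, 5, 4, 2, 1, 2, 5, 1]


def alarm_states(values, threshold, evaluation_periods):
    """Simplified model: ALARM only if the last N datapoints all breach."""
    states = []
    for i in range(len(values)):
        window = values[max(0, i - evaluation_periods + 1): i + 1]
        if len(window) < evaluation_periods:
            states.append("INSUFFICIENT_DATA")  # not enough history yet
        elif all(v > threshold for v in window):
            states.append("ALARM")
        else:
            states.append("OK")
    return states


states = alarm_states(datapoints, THRESHOLD, EVALUATION_PERIODS)
for period, state in enumerate(states, start=1):
    print(period, state)
```

In this model the alarm enters ALARM at period 5, after periods 3 through 5 have all breached, returns to OK at period 6, and stays OK at period 9 because a single breaching period does not fill the 3-period window.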