I had setup an Alert for CPU utilization on EC2 instance. Created one SNS topic to send alerts on mail. It sends me an alert when CPU utilization goes to ALARM state but I want repeated alerts till ALARM state get resolved. Please help me... I'm newbie to AWS.
What you can do is setup a Lambda function with a CloudWatch event trigger so that it runs periodically, and inside it call the CloudWatch GetMetricStatistics API. Then, simply check if it is above or below your preferred threshold (or if you want, whether or not it's in Alarm state) and publish a message to SNS. There are a lot of SDK documentations on how to use these API's with your preferred language.
It is not possible to get repeated notifications after getting into the ALARM state. As the alarm is entering the ALARM state only once that means the notification via Amazon SNS will be sent only once.
Autoscaling policy will be triggered by the same alarm. But mail will be sent only once.
Related
Question
I have set up a Laravel project that connects to AWS MediaLive for streaming.
Everything is working fine, and I am able to stream, but I couldn't find a way to see if a channel that was running had anyone connected to it.
What I need
I want to be able to see if a running channel has anyone connected to it via the php SDK.
Why
I want to show a stream on the user's side only if there is someone connected to it.
I want to stop a channel that has noone connected to it for too long (like an hour?)
Other
I tried looking at the docs but the closest thing I could find was the DescribeChannel command.
This however does not return any informations about the alerts. I also tried comparing the output of DescribeChannel when someone was connected and when noone was connected, but there was no difference
On the AWS site I can see the alerts on the channel page, but I cannot find how to view that from my laravel application.
Update
I tried running these from the SDK:
CloudWatch->DescribeAlarms();
CloudWatchLogs->GetLogEvents(['logGroupName'=>'ElementalMediaLive', 'logStreamName'=>'channel-log-stream-name']);
But it seems to me that their output didn't change after a channel started running without anyone connected to it.
I went on the console's CloudWatch and it was the same.
Do I need to first set up Egress Points for alerts to show here?
I looked into SNS Topics and lambda functions, but it seems they are for sending messages and notifications? can I also use this to stop/delete a channel that has been disconnected for over an hour? Are there any docs that could help me?
I'm using AWS MediaStore, but I'm guessing I can do the same as AWS MediaPackage? How can the threshold tell me if, and for how long no-one has been connected to a MediaLive channel?
Overall
After looking here and there in the docs I am assuming I have to:
1. set up a metric alarm that detects when a channel had no input for over an hour
2. Send the alarm message to the CloudWatchLogs
3. retrieve the alarm message from the SDK and/or the SNS Topic
4. stop/delete the channel that sent the alarm message
Did I understand this correctly?
Thanks for your post.
Channel alerts will go your AWS CloudWatch logs. You can poll these alarms from SDK or CLI using a command of the form 'aws cloudwatch describe-alarms'. Related log events may be retrieved with a command of the form 'aws logs get-log-events'.
You can also configure a CloudWatch rule to propagate selected service alerts to an SNS Topic which can be polled by various clients including a Lambda function, which can then take various actions on your behalf. This approach works well to aggregate the alerts from multiple channels or services.
Measuring the connected sessions is possible for MediaPackage endpoints, using the 2xx Egress Request Count metric. You can set a metric alarm on this metric such that when its value drops below a given threshold, and alarm message will be sent to the CloudWatch logs mentioned above.
With Regard to your list:
set up a metric alarm that detects when a channel had no input for over an hour
----->CORRECT.
Send the alarm message to the CloudWatchLogs
----->The alarm message goes directly to an SNS Topic, and will be echoed to your CloudWatch logs. See: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/AlarmThatSendsEmail.html
a Lambda Fn will need to be created to process new entries arriving in the SNS topic (queue) mentioned above, and take a desired action. This Lambda Fn can send API or CLI calls to stop/delete the channel that sent the alarm message. You can also have email alerts or other actions triggered from the SNS Topic (queue); refer to https://docs.aws.amazon.com/sns/latest/dg/sns-common-scenarios.html
Alternatively, you could do everything in one lambda function that queries the same MediaPackage metric (EgressRequestCount), evaluates the response and takes a yes/no action WRT shutting down a specified channel. This lambda function could be scheduled to run in a recurring fashion every 5 minutes to achieve the desired result. This approach would be simpler to implement, but is limited in scope to the metrics and actions coded into the Lambda Function. The Channel Alert->SNS->LAMBDA approach would allow you to take multiple actions based on any one Alert hitting the SNS Topic (queue).
In our team's infrastructure we have a Databricks job which sends data to an SQS queue which triggers a Lambda function. The Databricks job runs one in every 30 minutes. A week ago the Databricks job was failing continuously so it was not sending data, therefore the Lambda function was not triggered. Is there any way to set up an alert so that I get notified if the lambda function is not triggered for a period of 2 hours?
When I searched for a solution I was only able to see to get an alert if and when a Lambda fails or if a specific log type is found in its cloudwatch logs etc, but couldn't see any solution for the above scenario.
You can create a Cloudwatch alarm for the Invocation metrics for that lambda; you can configure the alarm so that if there are no invocations over a timespan of two hours, it goes into an ALARM state.
If you wish to be notified, you can also configure the Cloudwatch alarm to send a message to an SNS topic, which can then be configured to trigger SES so that it sends you an email (for example).
We are using cloudwatch metrics filter and setup alarm and send notification through SNS - Email.
We are wondering if it is possible to also see logs that triggers the alarm in the email? or is it possible to do so with the help with a custom lambda function?
Thanks
Locally, you may try to have the following set up.
CloudWatch Alarm
SNS-log-reader with lambda as a subscriber
SNS-send-mails with e-mails as a subscriber
So the logic would be - Alarm trigger SNS-log-reader. Then lambda which is mapped to *SNS-log-reader can read the logs from LogGroup, build the message in needed format and send to e-mail/s via SNS-send-mails (or Simple Email Service (Amazon SES))
I have a CloudWatch alarm created for testing purposes. It checks the number of bytes read on a Kinesis Stream. If less than 1 bytes are received within 1 minute, it triggers alarm and send email via SNS. So, I get email after 1 min, but then after that I don’t get any further email. Is it right the email notification sent only once? In my test data is not flowing all the time. So, ideally, it should send email every minute. Correct?
Whether or not the action keeps firing depends on the type of action.
SNS actions only trigger once when the state changes to ALARM. Other actions such as EC2 auto scaling keep triggering as long as the alarm is in ALARM state.
If your alarm reverts to OK state and then back again to ALARM, SNS will get triggered again.
AWS docs
If CloudWatch alarm switched to ALARM state after specified period (1 minute), then in your case it means it received less than 1 byte. When switched, it will trigger configured actions (in your case it is email notification). As long as alarm remains in the ALARM state without switching it state back to OK, nothing will be triggered again.
if alarm returned back to OK state and then again after the specified period it switched to ALARM, configured action will be triggered again.
For more information, refer to documentation.
We have aws alarms set up to email on alarm but we would like to continue to get the alarm notification even if the state is in Alarm without a state change. How could I achieve this (would be happy to use a lambda but no idea how to do it)
Amazon CloudWatch alarm notifications are only sent when the state of the alarm changes. It is not possible to configure CloudWatch to continually send notifications while in the ALARM state.
You would need to write your own code to send such notifications. This could be accomplished via a cron job, scheduled AWS Lambda function or your own application.
Try with a script using Cloudwatch API for example with Boto3 + Python or a Lambda running every X minutes. I have a python script to get values from cloudwatch you can adapt it. http://www.dbigcloud.com/cloud-computing/230-integrando-metricas-de-aws-cloudwatch-en-zabbix.html
One alternative is, to create a Lambda function to send email and host that function using CloudWatch Rule with Scheduled option and target as Lambda function that you have created. In Schedule option, you can set the frequency of time that you expect to receive email. In defined frequency, the Rule will trigger Lambda Function to send email.