Question
I have set up a Laravel project that connects to AWS MediaLive for streaming.
Everything is working fine and I am able to stream, but I couldn't find a way to see whether a running channel has anyone connected to it.
What I need
I want to be able to see if a running channel has anyone connected to it via the PHP SDK.
Why
I want to show a stream on the user's side only if there is someone connected to it.
I want to stop a channel that has had no one connected to it for too long (like an hour?).
Other
I tried looking at the docs but the closest thing I could find was the DescribeChannel command.
This, however, does not return any information about the alerts. I also tried comparing the output of DescribeChannel when someone was connected and when no one was connected, but there was no difference.
On the AWS site I can see the alerts on the channel page, but I cannot find how to view them from my Laravel application.
Update
I tried running these from the SDK:
CloudWatch->DescribeAlarms();
CloudWatchLogs->GetLogEvents(['logGroupName'=>'ElementalMediaLive', 'logStreamName'=>'channel-log-stream-name']);
But it seems to me that their output didn't change after a channel started running without anyone connected to it.
I went on the console's CloudWatch and it was the same.
Do I need to first set up Egress Points for alerts to show here?
I looked into SNS Topics and Lambda functions, but it seems they are for sending messages and notifications? Can I also use them to stop/delete a channel that has been disconnected for over an hour? Are there any docs that could help me?
I'm using AWS MediaStore, but I'm guessing I can do the same as with AWS MediaPackage? How can the threshold tell me whether, and for how long, no one has been connected to a MediaLive channel?
Overall
After looking here and there in the docs I am assuming I have to:
1. Set up a metric alarm that detects when a channel has had no input for over an hour
2. Send the alarm message to CloudWatch Logs
3. Retrieve the alarm message from the SDK and/or the SNS Topic
4. Stop/delete the channel that sent the alarm message
Did I understand this correctly?
Thanks for your post.
Channel alerts will go to your AWS CloudWatch logs. You can poll these alarms from the SDK or CLI using a command of the form 'aws cloudwatch describe-alarms'. Related log events can be retrieved with a command of the form 'aws logs get-log-events'.
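For reference, here is a minimal sketch of those two calls with the AWS SDK for PHP v3; the region is a placeholder, and the log group/stream names are the ones from your own getLogEvents attempt, so adjust them to your setup.

<?php
require 'vendor/autoload.php';

use Aws\CloudWatch\CloudWatchClient;
use Aws\CloudWatchLogs\CloudWatchLogsClient;

$config = ['version' => 'latest', 'region' => 'us-east-1']; // placeholder region

// Equivalent of `aws cloudwatch describe-alarms`, restricted to alarms currently firing.
$cloudWatch = new CloudWatchClient($config);
$alarms = $cloudWatch->describeAlarms(['StateValue' => 'ALARM']);
foreach ($alarms['MetricAlarms'] as $alarm) {
    echo $alarm['AlarmName'] . ' is in state ' . $alarm['StateValue'] . PHP_EOL;
}

// Equivalent of `aws logs get-log-events`.
$logs = new CloudWatchLogsClient($config);
$events = $logs->getLogEvents([
    'logGroupName'  => 'ElementalMediaLive',
    'logStreamName' => 'channel-log-stream-name', // placeholder stream name
    'limit'         => 50,
]);
foreach ($events['events'] as $event) {
    echo $event['message'] . PHP_EOL;
}

The response shapes mirror the CLI output, so $alarms['MetricAlarms'] and $events['events'] correspond to the JSON those CLI commands print.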
You can also configure a CloudWatch rule to propagate selected service alerts to an SNS Topic which can be polled by various clients including a Lambda function, which can then take various actions on your behalf. This approach works well to aggregate the alerts from multiple channels or services.
Measuring the connected sessions is possible for MediaPackage endpoints, using the 2xx Egress Request Count metric. You can set a metric alarm on this metric such that when its value drops below a given threshold, an alarm message will be sent to the CloudWatch logs mentioned above.
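As an illustration, a hedged sketch of such an alarm with the PHP SDK is below. The namespace, metric name and dimension names are my assumptions about how the MediaPackage metric is exposed (verify them against the MediaPackage monitoring documentation), and the channel ID and SNS topic ARN are placeholders.

<?php
require 'vendor/autoload.php';

use Aws\CloudWatch\CloudWatchClient;

$cloudWatch = new CloudWatchClient(['version' => 'latest', 'region' => 'us-east-1']); // placeholder region

$cloudWatch->putMetricAlarm([
    'AlarmName'          => 'no-viewers-my-channel',
    'Namespace'          => 'AWS/MediaPackage',      // assumption: check the MediaPackage metric docs
    'MetricName'         => 'EgressRequestCount',
    'Dimensions'         => [
        ['Name' => 'Channel',         'Value' => 'my-mediapackage-channel-id'], // placeholder
        ['Name' => 'StatusCodeRange', 'Value' => '2xx'],                        // assumption
    ],
    'Statistic'          => 'Sum',
    'Period'             => 300,          // 5-minute buckets
    'EvaluationPeriods'  => 12,           // 12 x 5 min = 1 hour with no traffic
    'Threshold'          => 1,
    'ComparisonOperator' => 'LessThanThreshold',
    'TreatMissingData'   => 'breaching',  // no data at all also counts as "no viewers"
    'AlarmActions'       => ['arn:aws:sns:us-east-1:123456789012:medialive-idle'], // placeholder topic ARN
]);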
With regard to your list:
Set up a metric alarm that detects when a channel has had no input for over an hour
----->CORRECT.
Send the alarm message to CloudWatch Logs
----->The alarm message goes directly to an SNS Topic, and will be echoed to your CloudWatch logs. See: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/AlarmThatSendsEmail.html
Retrieve the alarm message from the SDK and/or the SNS Topic / Stop/delete the channel that sent the alarm message
----->A Lambda Fn will need to be created to process new entries arriving in the SNS Topic (queue) mentioned above and take the desired action. This Lambda Fn can make API or CLI calls to stop/delete the channel that sent the alarm message. You can also have email alerts or other actions triggered from the SNS Topic (queue); refer to https://docs.aws.amazon.com/sns/latest/dg/sns-common-scenarios.html
Alternatively, you could do everything in one Lambda function that queries the same MediaPackage metric (EgressRequestCount), evaluates the response and takes a yes/no decision on shutting down a specified channel. This Lambda function could be scheduled to run every 5 minutes to achieve the desired result. This approach is simpler to implement, but is limited in scope to the metrics and actions coded into the Lambda function. The Channel Alert->SNS->Lambda approach would allow you to take multiple actions based on any one alert hitting the SNS Topic (queue).
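A minimal sketch of that "one function" variant, written with the PHP SDK since that is your stack; the namespace, dimension name, channel IDs and region are placeholders/assumptions, and the same two operations (GetMetricStatistics and StopChannel) exist in every AWS SDK if the scheduled function ends up written in another language.

<?php
require 'vendor/autoload.php';

use Aws\CloudWatch\CloudWatchClient;
use Aws\MediaLive\MediaLiveClient;

$config     = ['version' => 'latest', 'region' => 'us-east-1']; // placeholder region
$cloudWatch = new CloudWatchClient($config);
$mediaLive  = new MediaLiveClient($config);

// How many egress requests were served in the last hour?
$stats = $cloudWatch->getMetricStatistics([
    'Namespace'  => 'AWS/MediaPackage',   // assumption: check the MediaPackage metric docs
    'MetricName' => 'EgressRequestCount',
    'Dimensions' => [
        ['Name' => 'Channel', 'Value' => 'my-mediapackage-channel-id'], // placeholder
    ],
    'StartTime'  => strtotime('-1 hour'),
    'EndTime'    => time(),
    'Period'     => 3600,
    'Statistics' => ['Sum'],
]);

$requestsLastHour = 0;
foreach ($stats['Datapoints'] as $datapoint) {
    $requestsLastHour += $datapoint['Sum'];
}

if ($requestsLastHour < 1) {
    // Nobody pulled the stream for an hour: stop the MediaLive channel.
    $mediaLive->stopChannel(['ChannelId' => '1234567']); // placeholder MediaLive channel ID
}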
Related
I have a CloudWatch alert set up on all lambdas sending data to an SNS topic.
Using the metric sum(errors) across all functions, I get the notification as expected, but there is no information in there to identify which of my lambdas triggered the alarm, or in other words which one failed.
If I set up the alarm individually on each lambda, then I get the information on which one failed under Dimensions. But I have a lot of them and plan to add more, so this process will become painful.
How can I leverage CloudWatch to alert me on all lambda failures and also provide info on which lambda failed and the error message?
Should this be implemented in a different way?
The AWS Cloud Operations & Migrations Blog has a post published on this topic.
Instead of using CloudWatch Alarms as you are doing now, you can use a CloudWatch Logs subscription. Whenever a log entry matches a specific pattern that you specify, it will trigger a new Lambda function that can notify you however you choose. In the blog post, the Lambda uses SNS to send an email notification.
You can control what information gets included in the body of the notification by adjusting what the Lambda function sends to SNS. The log group name, log stream, and the error message itself can be included.
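To make the shape of that subscribed function concrete, here is a minimal sketch of its logic, shown in PHP for consistency with the rest of this page (the handler wiring depends on your Lambda runtime, and the topic ARN is a placeholder). CloudWatch Logs delivers the matched events as base64-encoded, gzipped JSON.

<?php
require 'vendor/autoload.php';

use Aws\Sns\SnsClient;

function handleLogSubscriptionEvent(array $event): void
{
    // Decode the subscription payload: base64 -> gzip -> JSON.
    $payload = json_decode(gzdecode(base64_decode($event['awslogs']['data'])), true);

    $lines = [];
    foreach ($payload['logEvents'] as $logEvent) {
        $lines[] = $logEvent['message'];
    }

    $sns = new SnsClient(['version' => 'latest', 'region' => 'us-east-1']); // placeholder region
    $sns->publish([
        'TopicArn' => 'arn:aws:sns:us-east-1:123456789012:lambda-errors', // placeholder topic ARN
        'Subject'  => 'Error in ' . $payload['logGroup'],
        // Put whatever you want in the body: log group, log stream, the error messages themselves.
        'Message'  => "Log group: {$payload['logGroup']}\n"
                    . "Log stream: {$payload['logStream']}\n\n"
                    . implode("\n", $lines),
    ]);
}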
I'm trying to set up notifications to be sent from our AWS Lambda instance to a Slack channel. I'm following along with this guide:
https://medium.com/analytics-vidhya/generate-slack-notifications-for-aws-cloudwatch-alarms-e46b68540133
I get stuck on step 4, however, because the type of alarm I want to set up does not involve thresholds or anomalies. It involves a specific error in our code. We want to be notified when users encounter errors when attempting to log in or sign up. We have try/catch blocks in our Node.js backend to log errors to CloudWatch at various points in the login/signup flow where we think the errors are most likely happening. We would like to identify when those SPECIFIC errors are occurring and send a notification to a Slack channel built for this purpose.
So in step 4 of the article, what would I have to do to set this up? Or is the approach in this article simply the wrong one for my purposes?
Thanks.
Step 4, titled "Create a CloudWatch Alarm", uses the CPUUtilization metric to trigger an alarm.
In your case, since you want to use CloudWatch Logs, you would create CloudWatch Metric Filters based on the log entries of interest. This produces custom metrics based on your error string. Subsequently, you would create a CloudWatch Alarm on that metric, as shown in the linked tutorial for CPUUtilization.
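A hedged sketch of those two steps with the AWS SDK for PHP; the log group name, the error string in the filter pattern, the metric namespace/name and the SNS topic ARN are all placeholders for your own values, and the same calls exist in the console, the CLI and every other SDK.

<?php
require 'vendor/autoload.php';

use Aws\CloudWatch\CloudWatchClient;
use Aws\CloudWatchLogs\CloudWatchLogsClient;

$config = ['version' => 'latest', 'region' => 'us-east-1']; // placeholder region

// 1. Metric filter: every matching log line increments the custom metric by 1.
$logs = new CloudWatchLogsClient($config);
$logs->putMetricFilter([
    'logGroupName'          => '/aws/lambda/my-backend',  // placeholder log group
    'filterName'            => 'login-signup-errors',
    'filterPattern'         => '"LoginError"',             // placeholder: the exact error string you log
    'metricTransformations' => [[
        'metricName'      => 'LoginSignupErrors',
        'metricNamespace' => 'MyApp',
        'metricValue'     => '1',
    ]],
]);

// 2. Alarm on the custom metric, exactly as the tutorial does for CPUUtilization.
$cloudWatch = new CloudWatchClient($config);
$cloudWatch->putMetricAlarm([
    'AlarmName'          => 'login-signup-errors',
    'Namespace'          => 'MyApp',
    'MetricName'         => 'LoginSignupErrors',
    'Statistic'          => 'Sum',
    'Period'             => 60,
    'EvaluationPeriods'  => 1,
    'Threshold'          => 1,
    'ComparisonOperator' => 'GreaterThanOrEqualToThreshold',
    'AlarmActions'       => ['arn:aws:sns:us-east-1:123456789012:slack-alerts'], // placeholder topic ARN
]);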
I have an application which processes messages from a Pub/Sub topic, and if it fails the message is sent to a separate DLQ topic. I want to be able to set an alarm in Monitoring so that when 30k messages are sent to the DLQ during a day it notifies me and I can check why my service is not working.
I tried to set up some alerting policies in GCP, but I couldn't find anywhere in the docs how to set up a metric of daily processed messages on a topic.
Can anyone help me ?
You can create a new alert policy on the PubSub subscription/unacked messages metric.
You can add a filter on your subscription name if you have several subscriptions in your project.
Add the notification channel that you want (an email in my case). After a few minutes, you can see the first alert and the corresponding email.
EDIT
For the acked messages, you can do the same thing. I never tried an aggregation over 1 day, but it should be OK.
Please check the following GCP community tutorial, which outlines how to create an alert-based event archiver with Stackdriver and Cloud Pub/Sub:
https://cloud.google.com/community/tutorials/cloud-pubsub-drainer
I have set up an alarm for CPU utilization on an EC2 instance and created an SNS topic to send alerts by mail. It sends me an alert when CPU utilization goes into the ALARM state, but I want repeated alerts until the ALARM state is resolved. Please help me... I'm a newbie to AWS.
What you can do is set up a Lambda function with a CloudWatch Events trigger so that it runs periodically, and inside it call the CloudWatch GetMetricStatistics API. Then simply check whether the metric is above or below your preferred threshold (or, if you prefer, whether or not the alarm is in the ALARM state) and publish a message to SNS. There is plenty of SDK documentation on how to use these APIs in your preferred language.
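A minimal sketch of that periodic check, shown here with the AWS SDK for PHP; the instance ID, threshold, region and topic ARN are placeholders, and any SDK language works the same way.

<?php
require 'vendor/autoload.php';

use Aws\CloudWatch\CloudWatchClient;
use Aws\Sns\SnsClient;

$config = ['version' => 'latest', 'region' => 'us-east-1']; // placeholder region

// Average CPU over the last 5 minutes for one instance.
$stats = (new CloudWatchClient($config))->getMetricStatistics([
    'Namespace'  => 'AWS/EC2',
    'MetricName' => 'CPUUtilization',
    'Dimensions' => [['Name' => 'InstanceId', 'Value' => 'i-0123456789abcdef0']], // placeholder instance
    'StartTime'  => strtotime('-5 minutes'),
    'EndTime'    => time(),
    'Period'     => 300,
    'Statistics' => ['Average'],
]);

foreach ($stats['Datapoints'] as $datapoint) {
    if ($datapoint['Average'] > 80) {   // placeholder threshold
        (new SnsClient($config))->publish([
            'TopicArn' => 'arn:aws:sns:us-east-1:123456789012:cpu-alerts', // placeholder topic ARN
            'Subject'  => 'CPU still above threshold',
            'Message'  => 'Average CPU over the last 5 minutes: ' . $datapoint['Average'] . '%',
        ]);
    }
}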
It is not possible to get repeated notifications for the same ALARM state. The alarm enters the ALARM state only once, which means the notification via Amazon SNS is sent only once.
An Auto Scaling policy would be triggered by the same alarm, but the mail is still sent only once.
I would like to send to SQS all CloudWatch logs where the message of the console.log (appearing in my CloudWatch logs) matches a certain pattern (for example, including the word "postToSlack", or having a certain JSON field like "slack:true"...).
But I'm stuck at the very beginning of my attempts: I am first trying to implement the most basic task: send to SQS ALL the CloudWatch log messages written when my lambdas are executed (via the console.logs placed inside the lambda functions). Why? Because I first want to make the simplest thing work before adding the complexity of filtering which logs to send and which not to send.
So I created a CloudWatch Rules > Event > Event Pattern like the one below:
{
  "source": [
    "aws.logs"
  ]
}
and as a Target, I selected SQS and then a queue I have created.
But when I trigger my lambdas, for example, they do appear in the CloudWatch logs, so I would have expected the log content to be "sent" to the queue, but nothing is visible on SQS when I poll/check the content of the queue.
Is there something I am misunderstanding about CloudWatch Rules?
CONTEXT EXPLANATION
I have lambdas that trigger massively every hour (at my scale :) ), with maybe 300 to 500 lambda executions in a 1 or 2 minute period.
I want to monitor on Slack all their console.logs (I am logging real error.stack JavaScript messages as well as purely informative messages like the result of the lambda output: "Report Card of the lambda: company=Apple, location=cupertino...").
I could just make an HTTP call to Slack from each lambda, but Slack's incoming hooks have a limit of about 1 request per second; after that you get 429 errors if you try to send more than 1 incoming webhook per second... So I thought I'd need a queue, so that I don't have 300+ lambdas writing to Slack in the same second, and could instead control the flow from AWS to Slack in a centralized queue called slackQueue.
My idea is to send certain logs (see further down) from CloudWatch to the SQS slackQueue, then use this SQS queue as a lambda trigger and have that lambda send batches of 10 messages (the maximum allowed by AWS; for me 1 message = 1 console.log) concatenated into one big string or array (whatever) to my Slack channel. (By the way, based on Slack limits you can concatenate and send up to 100 Slack messages in one call, so if I could process and concatenate 100 messages = console.logs I would, but the current AWS batch size limit is 10, I think.) This way I ensure I am not sending more than 1 "request" per second to Slack (that request containing the content of 10 console.logs).
When I say "certain logs" above, I mean that I actually don't want ALL logs to be sent to the queue (because I don't want them on Slack): I don't want the purely "debugging" messages like console.log("entered function foo"), which are useful during development but have nothing to do on Slack.
As regards some comments: to my understanding (I'm no AWS expert), I don't want to use CloudWatch alarms or metric filters because they're quite pricey (they'd be triggered hundreds of times every hour) and don't actually fit my need. I don't want to read something on Slack only when a critical problem or a "problem" occurs (like CPU > xxx ...), but really to send a regular, filtered flow of "almost" all my logs to Slack, and read the logs inside Slack instead of inside AWS: Slack is the tool that is open all day long, it is already used as a centralized place for logs/messages coming from sources other than AWS, and pretty Slack attachment message formatting is better digested by us. Of course the final lambda (the one sending the messages to Slack) would do a bit of formatting to add the italic/bold/etc. and the markdown required by Slack to have nicely formatted "Slack attachments", but that's not the most complex issue here :)
@Mathieu, I guess you've slightly misunderstood CloudWatch Events versus CloudWatch Logs.
What you need is real-time processing of the log data generated by your lambda functions, filtering the logs based on a pattern, and then sending those filtered logs to your Slack channel for analysis.
But configuring a CloudWatch Events rule with an SQS target is something like an SQS trigger for Lambda: CloudWatch will trigger (send a message to) the SQS queue, and the content of that message is not your logs, but either the default event or the custom message that you've created.
Solution #1:
Use a subscription filter to filter the logs as per your requirement and subscribe AWS Kinesis / AWS Lambda / Amazon Kinesis Data Firehose to it (a sketch of creating such a filter follows after the links below).
Using the filtered stream (Kinesis), trigger your lambda to push that data to Slack.
https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/Subscriptions.html
https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/SubscriptionFilters.html
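A hedged sketch of step 1 with the PHP SDK, using a Lambda function as the direct destination. The log group name, filter name and function ARN are placeholders, the pattern assumes you tag Slack-worthy lines with "postToSlack" as described in the question, and the destination function must also grant logs.amazonaws.com permission to invoke it.

<?php
require 'vendor/autoload.php';

use Aws\CloudWatchLogs\CloudWatchLogsClient;

$logs = new CloudWatchLogsClient(['version' => 'latest', 'region' => 'us-east-1']); // placeholder region

// Only log events matching the pattern are forwarded to the destination Lambda.
$logs->putSubscriptionFilter([
    'logGroupName'   => '/aws/lambda/my-hourly-job',   // placeholder: one of your lambdas' log groups
    'filterName'     => 'slack-worthy-logs',
    'filterPattern'  => '"postToSlack"',                // assumption: the tag you put in console.log lines
    'destinationArn' => 'arn:aws:lambda:us-east-1:123456789012:function:slack-forwarder', // placeholder ARN
]);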
Solution #2:
Push your CloudWatch logs to S3.
Create a notification in S3 on the 'ObjectCreated' event and use it to trigger a Lambda function.
In your Lambda function, write the logic to read the logs from S3 (equivalent to reading a file), filter them and push the filtered logs to Slack.
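A hedged sketch of that Lambda logic in PHP (the runtime wiring, bucket layout and webhook URL are assumptions/placeholders): read the object that triggered the event, keep only the lines you care about, and post them to Slack in one concatenated message.

<?php
require 'vendor/autoload.php';

use Aws\S3\S3Client;

function handleS3LogObject(array $event): void
{
    $s3 = new S3Client(['version' => 'latest', 'region' => 'us-east-1']); // placeholder region

    // Fetch the object named in the S3 ObjectCreated notification.
    $record = $event['Records'][0];
    $object = $s3->getObject([
        'Bucket' => $record['s3']['bucket']['name'],
        'Key'    => urldecode($record['s3']['object']['key']),
    ]);

    // Keep only the lines that should reach Slack.
    $lines = array_filter(
        explode("\n", (string) $object['Body']),
        fn ($line) => str_contains($line, 'postToSlack')   // assumption: your "send me" tag
    );
    if ($lines === []) {
        return;
    }

    // Post one concatenated message to the incoming webhook (placeholder URL).
    $ch = curl_init('https://hooks.slack.com/services/XXX/YYY/ZZZ');
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_HTTPHEADER     => ['Content-Type: application/json'],
        CURLOPT_POSTFIELDS     => json_encode(['text' => implode("\n", $lines)]),
    ]);
    curl_exec($ch);
    curl_close($ch);
}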