Alert on Lambda failure with detailed info - amazon-web-services

I have a cloudWatch alert setup on all lambdas sending data to a an SNS topic
Using the metric as
sum(errors) across all functions
I get the notification as expected, but there is no information in there to identify which amongst my lambdas triggered the alarm or in other words which one failed
If I setup the alarm individually on each lambda, then I get the information on which one failed under Dimensions. But I have a lot of them and plan to add more and this process will become painful
How can I leverage cloudWatch to alert me on all lambda failures and also provide info on which lambda failed and the error message ?
Should this be implemented in a different way ?

The AWS Cloud Operations & Migrations Blog has a post published on this topic.
Instead of using CloudWatch Alarms as you are doing now, you can use a CloudWatch Logs subscription. Whenever a log entry matches a specific pattern that you specify, it will trigger a new Lambda function that can notify you however you choose. In the blog post, the Lambda uses SNS to send an email notification.
You can control what information gets included in the body of the notification by adjusting what the Lambda function sends to SNS. The log group name, log stream, and the error message itself can be included.

Related

How to know once all the SNS subscribers received the message?

Let's say I have set up a AWS SNS with 3 subscribers. I'd like to know when all of the subscribers received/processed the message in order to mark that message as processed by all 3, and to generate some metrics.
Is there a way to do this?
You can log delivery status for SNS topics to CloudWatch, but only for certain types of messages (AWS has no reliable way of knowing if some messages were received or not, such as with SMS or email).
The types of messages you can log are:
HTTP
Lambda
SQS
Custom Application (must be configured to tell AWS that the message is received)
To set up logging in SNS:
In the SNS console, click "Edit Topic"
Expand "delivery status logging"
Then you can configure which protocols to log and the necessary permissions to do so.
Once you're logging to CloudWatch, you can draw metrics from there.
If you need to be notified when the subscribers have received the messages, you could set up a subscription filter within cloudwatch to send the relevant log events to a lambda function, in which you would implement custom logic to notify you appropriately.
I mean successful processing by the consumer
Usually your consumers would have to indicate this somehow. This is use-case specific, therefore its difficult to speculate on exact solutions.
But just to give an example, a popular patter is Request-response messaging pattern. In here, your your consumers would use a SQS queue to publish outcome of the message processing. The producer(s) would pull the queue to get these messages, subsequently, knowing which messages were correctly process and which not.

AWS CloudWatch Rule triggered by Log Event

I want to create CloudWatch Rule that would be triggered upon creation of Log Event. For that reason as an event pattern I selected CloudWatch Logs service but when I try to generate some Cloud Watch logs the rule is not getting triggered. I can not find any example of using aws.logs as a source for an event and hence my question if I'm doing something wrong.
This is because the only events for logs available are AWS API Call via CloudTrail. CloudWatch Logs does not generate CloudWatch events on receiving new log entries.
For the Logs API call events to work, you need to setup CloudTrial trial.
However, if you want to trigger your lambda function based on log entries, I can recommend using subscription filters for lambda:
You can use subscriptions to get access to a real-time feed of log events from CloudWatch Logs and have it delivered to other services such as a Amazon Kinesis stream, Amazon Kinesis Data Firehose stream, or AWS Lambda for custom processing, analysis, or loading to other systems.

cloudwatch metric filter alert with logs

We are using cloudwatch metrics filter and setup alarm and send notification through SNS - Email.
We are wondering if it is possible to also see logs that triggers the alarm in the email? or is it possible to do so with the help with a custom lambda function?
Thanks
Locally, you may try to have the following set up.
CloudWatch Alarm
SNS-log-reader with lambda as a subscriber
SNS-send-mails with e-mails as a subscriber
So the logic would be - Alarm trigger SNS-log-reader. Then lambda which is mapped to *SNS-log-reader can read the logs from LogGroup, build the message in needed format and send to e-mail/s via SNS-send-mails (or Simple Email Service (Amazon SES))

Send Cloudwatch logs matching a pattern to SQS queue

I would like to send all Cloudwatch logs where the message of the console.log (appearing in my Cloudwatch logs) matches a certain pattern( for example including the word "postToSlack", or having a certain json field like "slack:true"...)
But I'm stuck at the very beginning of my attempts: I am first trying to implement the most basic task: send ALL cloudwatch logs written when my lambdas are executed (via console.logs placed inside the lambda functions) message to SQS (why? because I first try to make the simplest thing before complexifying with filtering which log to send and which log not to send).
So I created a Cloudwatch Rules > Event > Event Pattern like here below:
{
"source": [
"aws.logs"
]
}
and as a Target, I selected SQS and then a queue I have created.
But when I trigger for example my lambdas, they do appear in Cloudwatch logs, so I would have expected the log content to be "sent" to the queue but nothing is visible on SQs when I poll/check the content of the queue.
Is there something I am misunderstanding about cloudwatch Rules ?
CONTEXT EXPLANATION
I have lambdas that every hour trigger massively (at my scale:) with like maybe 300 to 500 executions of lambdas in a 1 or 2 minutes period.
I want to monitor on Slack all their console.logs (i am logging real error.stack javascript messages as well as purely informative messages like the result of the lambda output "Report Card of the lambda: company=Apple, location=cupertino...").
I could just use a http call to Slack on each lambda but Slack for incoming hooks has a limit of about 1 request per second, after that you get 429 errors if you try to send more than 1 incoming webhook per second... So I thought I'd need to use a queue so that I don't have 300+ lambdas writing to Slack at the same second, but instead controlling the flow from AWS to Slack in a centralized queue called slackQueue.
My idea is to send certain logs (see further down) from Cloudwatchto the SQS slackQueue, and then use this SQS queue as a lambda trigger and sending with this lambda batches of 10 messages (the maximum allowed by AWS; for me 1 message= 1 console.log) concatenated into one big string or array (whatever) to send it to my Slack channel (btw, you can concatenate and send in one call up to 100 slack messages based on Slack limits, so if i could process 100 messages=console.log and concatenate I would but the current batch size limit is 10 for AWS I think ), this way, ensuring I am not sending more than 1 "request" per second to Slack (this request having the content of 10 console.logs).
When I say above "certain logs", it means, I actually I don't want ALL logs to be sent to the queue (because I don't want them on Slack): indeed I don't want the purely "debugging" messages like a console.log("entered function foo"). which are useful during development but have nothing to do on Slack.
As regards some comments: I don't want to use , to my understanding (not expert of AWS) cloudwatch alarms, or metrics filters because they're quite pricy (I'd have those triggered hundreds of times every hour) and don't actually fit my need: I don't want only to read on Slack only when critical problem or a "problem" occurs (like CPU> xxx ...) but really send a regular filtered flow of "almost" all my logs to Slack to read the logs inside Slack instead of inside AWS as Slack is the tool opened all day long, that it's being used for logs/messages coming from other sources than AWS as a centralized place, and that pretty Slack attachment messages formatting is better digested by us. Of course the final lambda (the one sending the messages to slack) would do a bit of formatting to add the italic/bold/etc., and markdown required by slack to have nicely formatted "Slack attachements" but that's not the most complex issue here :)
#Mathieu, I guess you've misunderstood the CloudWatch Events with CloudWatch logs slightly.
What you need is a real time processing of the log data generated by your lambda functions, filter the logs based on a pattern and then store those filtered logs to your Slack for analysis.
But configuring a CloudWatch Event with SQS is similar like a SQS trigger to Lambda. Here, cloudWatch will trigger (send message to) the SQS queue. The content of the message is not your logs but either the default or custom message that you've created.
Solution #1:
Use Subscription filter to filter out the logs as per requirement and subscribe to AWS Kinesis/AWS Lambda/Amazon Kinesis Data Firehouse.
Using the filtered stream (Kinesis), trigger your lambda to push that data to Slack.
https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/Subscriptions.html
https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/SubscriptionFilters.html
Solution #2:
Push your cloudWatch logs to S3.
Create a notification event in S3 on 'ObjectCreated' event and use that to trigger a Lambda function.
In your Lambda function, write the logic to read the logs from S3 (equivalent to reading a file), filter them and push the filtered logs to Slack.

send notification alert when AWS Lambda function has an error

I have a AWS Lambda function running some process in my infrastructure. The Lambda is triggered every 8 hours using a CloudWatch rule. I am trying to raise a notification if any error happens into the Lambda process. I tried to use SES but that service is not available in that Region.
I will like to know any suggestions for this problem:
How to setup notifications when an error occurs in my Lambda functions ?
I am looking for suggestions. This questions never asked for doing my task. I will appreciate any official documentation but either way, any help is welcome.
Some suggestions:
Dead Letter Queues:
If your error causes failed invocations, you can use a Lambda Dead Letter Queue to send the event to an SNS topic or an SQS queue. If you send it to an SNS topic, you can directly subscribe to the topic via SNS or Email to get notified any time a message is published to that topic.
Multi-region SES:
If you're really set on using SES directly, SES clients can be instantiated with an explicit region provided -- as long as your lambda's execution role has the appropriate permissions, you can send email to SES from a different region. Here's documentation for instantiating the JS SES Client.
CloudWatch Logs:
If your error does not cause the invocation to fail, another option is using a CloudWatch Logs metric filter to aggregate failures and potentially alarm on them. If you're using NodeJS, you can simply log out via console.log(), console.error(), etc. and it will be written out to CWLogs. More details here.
You can subscribe an SNS topic to CloudWatch Alarms, and notify yourself in the same way as the DLQ.
As you gain experience with the error and learn how to process common errors, you could also subscribe another lambda to the SNS topic from the DLQ/CWLogs example to process it as it happens.