AWS Lambda function was triggered twice by CloudWatch event - amazon-web-services

I deployed a service written in Python 2.7 on AWS Lambda. It extracts data from some pages and sends the results to a web app. The service is triggered by an AWS CloudWatch event (fixed rate of 5 minutes).
However, I found that the service was sometimes triggered twice at the same time. I noticed this because two log streams printed the same data and result but with different RequestIds, and the database contained duplicate data, which showed that both invocations completed successfully. It looked like the service was triggered twice at almost the same time for no reason.
Has anyone experienced the same thing, and how did you fix it? Or is there a way to ensure that only one instance of the function executes at a time?

Yes. Some AWS services guarantee at-least-once delivery, so an event can occasionally be delivered more than once. I have experienced this with CloudWatch and CloudTrail. I do not know whether you can limit delivery to exactly once; you have to check whether the data has already been processed. I handled this by making boto3 calls in my Python code to check before processing the data. Without knowing more about your situation, it is difficult to suggest a specific solution.
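As a rough illustration of the "check whether it was already processed" idea, here is a minimal sketch of an idempotent handler. It assumes that both deliveries of a scheduled event carry the same `id` field; if they do not, a key derived from the event's scheduled time would be an alternative. `InMemorySeenStore`, `make_handler` and `process` are hypothetical names, and the in-memory store is only a stand-in: duplicate invocations usually run in separate execution environments, so a real deployment would need a shared store (for example a DynamoDB conditional write using `attribute_not_exists`).

```python
import time


class InMemorySeenStore(object):
    """Stand-in for a durable, shared store (assumption: replace with
    something like a DynamoDB conditional put in a real deployment)."""

    def __init__(self):
        self._seen = {}

    def claim(self, key):
        """Return True only for the first caller that claims `key`."""
        if key in self._seen:
            return False
        self._seen[key] = time.time()
        return True


def make_handler(store, process):
    """Wrap `process` so that each event key is processed at most once."""

    def handler(event, context=None):
        # Assumption: duplicate deliveries share the same event "id".
        key = event["id"]
        if not store.claim(key):
            return {"status": "duplicate", "id": key}
        result = process(event)
        return {"status": "processed", "id": key, "result": result}

    return handler
```

The important property is that claiming the key and checking for it happen in one atomic step in the backing store; a separate "read, then write" would still let two near-simultaneous invocations both proceed.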

Related

AWS CloudWatch: logs are created in different log streams for a single API hit

We are using AWS Lambda and have configured CloudWatch for logging. A cron job runs every 5 minutes and triggers the Lambda function. The logs generated for a single hit are being created in different log streams.
So, let's say there is an API hit at 11:45. To check the logs, I then have to go through the log streams with last event times 2022-05-05 11:43:10 (UTC+05:30), 2022-05-05 11:43:00 (UTC+05:30), 2022-05-05 11:38:11 (UTC+05:30), 2022-05-05 11:38:02 (UTC+05:30) and so on, because the logs for a single hit are spread across different log streams: some are in the first log stream, some in the second, and a few in the third. Previously, all the logs for a single hit were created in a single log stream. Is there anything that can be done to avoid this? It makes debugging time-consuming.
This is how Lambda works: each Lambda execution environment gets its own log stream. If you need to look at logs across log streams, then the best "built-in" solution is CloudWatch Logs Insights, which works at the log-group level.
Update: this document describes the Lambda execution environment, and the conditions that cause creation/destruction of an environment.
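To make the Logs Insights suggestion concrete, here is a minimal sketch that queries a whole log group and returns events from all log streams merged in time order. It uses the boto3 `logs` client's `start_query` and `get_query_results` calls; the function names, the one-hour lookback and the default limit are assumptions chosen for illustration.

```python
import time


def build_query(needle=None, limit=200):
    """Build a Logs Insights query that merges all streams by timestamp.

    `needle` is an optional substring to filter on; it is assumed not to
    contain double quotes.
    """
    parts = ["fields @timestamp, @logStream, @message"]
    if needle:
        parts.append('filter @message like "{0}"'.format(needle))
    parts.append("sort @timestamp asc")
    parts.append("limit {0}".format(limit))
    return " | ".join(parts)


def fetch_logs(log_group, needle=None, lookback_seconds=3600, poll_seconds=1.0):
    """Run the query against `log_group` and return the result rows."""
    import boto3  # imported lazily so build_query works without boto3

    logs = boto3.client("logs")
    now = int(time.time())
    query_id = logs.start_query(
        logGroupName=log_group,
        startTime=now - lookback_seconds,
        endTime=now,
        queryString=build_query(needle),
    )["queryId"]
    # Queries run asynchronously, so poll until a terminal status is reached.
    while True:
        response = logs.get_query_results(queryId=query_id)
        if response["status"] not in ("Scheduled", "Running"):
            return response["results"]
        time.sleep(poll_seconds)
```

The same query string can be pasted into the Logs Insights console with the function's log group selected, which avoids opening the streams one by one.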

GCP Pub/Sub invoking Cloud Function twice

I have a Google Cloud Function subscribed to a topic. Our Go API publishes a message to the topic when an email needs to be sent to a user. The GCF creates the email object and sends it to Sendgrid. The problem is that 90% of the time, the cloud function gets invoked twice.
The acknowledgement deadline on the subscription is 600 seconds and it's clearly stated in the Docs that GCF acknowledges internally.
I understand that Pub/Sub guarantees at-least-once delivery and GCF guarantees at-least-once execution for background functions. Still, this happens in most cases, and I'm pretty sure that's not right either.
I'm 100% sure it's not our API that's sending 2 messages. The cloud function runs twice even when I manually publish a message from the GCP console to test.
So the execution_id is the same. Both executions take less than 1 second.
So I'm not sure what's going on. Who is responsible for this duplication?
I'm guessing it's GCF seeing as both executions have the same ID?
Does anyone have any ideas about how to fix this?
I ran into almost the same situation. I fixed it by deleting the Cloud Functions entries and the Cloud Pub/Sub subscriptions, then recreating them. It seems to work fine so far.

Does AWS guarantee my lambda function will be triggered 100%?

I set up my AWS workflow so that my Lambda function is triggered when a text file is added to my S3 bucket, and generally it works fine: when I upload a bunch of text files to the S3 bucket, a bunch of Lambdas run at the same time and each processes one text file.
But my issue is that occasionally 1 or 2 files (out of 20k or so in total) do not trigger the Lambda function as expected. I have no idea why. When I checked the logs, it was NOT that the file was processed by the Lambda and failed; the logs showed that the Lambda was not triggered for those 1 or 2 files at all. I don't believe it's hitting the limit of 1000 concurrent Lambdas either, since my function runs fast and the peak is around 200 Lambdas.
My question is: is this because AWS Lambda does not guarantee that it will be triggered 100% of the time? As with S3, is there always an (albeit tiny) possibility of failure? If not, how can I debug and fix this issue?
You don't mention how long the Lambdas take to execute. The default limit of concurrent executions is 1000. If you are uploading files faster than they can be processed with 1000 Lambdas then you'll want to reach out to AWS support and get your limit increased.
Also from the docs:
Amazon S3 event notifications typically deliver events in seconds but can sometimes take a minute or longer. On very rare occasions, events might be lost.
If your application requires particular semantics (for example, ensuring that no events are missed, or that operations run only once), we recommend that you account for missed and duplicate events when designing your application. You can audit for missed events by using the LIST Objects API or Amazon S3 Inventory reports. The LIST Objects API and Amazon S3 inventory reports are subject to eventual consistency and might not reflect recently added or deleted objects.
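The audit that the quoted documentation recommends could look roughly like the sketch below: list the keys in the bucket and compare them against the keys the Lambda is known to have processed. How the processed keys are recorded is not stated in the question, so `processed_keys` is an assumption (for example, a table the Lambda writes to on success), and the function names are hypothetical. The listing uses the boto3 S3 `list_objects_v2` paginator.

```python
def find_unprocessed(listed_keys, processed_keys):
    """Return the keys present in the bucket listing but never processed."""
    processed = set(processed_keys)
    return sorted(key for key in listed_keys if key not in processed)


def list_bucket_keys(bucket, prefix=""):
    """Yield every object key in `bucket` under `prefix`."""
    import boto3  # imported lazily so find_unprocessed works without boto3

    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent when a page has no objects.
        for obj in page.get("Contents", []):
            yield obj["Key"]
```

A periodic job could call `find_unprocessed(list_bucket_keys(bucket), processed_keys)` and re-invoke the Lambda for whatever comes back. Because duplicates are also possible, the processing itself should tolerate being run twice for the same key.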

AWS Lambda: Monitoring lambda timeout that was triggered by SNS.

I have an AWS Lambda that is triggered by an SNS message. Many times it has reached the maximum duration allowed by AWS, and AWS killed it immediately.
I have to dig into either the Lambda logs or the Lambda duration chart to find out about the error.
Is there a better way to report this kind of error?
Yes, there are some 3rd party tools that help you monitor your environment and provide exactly that - filter on specific errors and drill down to what happened there (the input event, the outgoing HTTP requests etc.).
Moreover, you can also configure alerts on specific errors and receive them via Slack or email.
Disclosure: I work for Lumigo, a company that does exactly that.

Storing values through AWS lambda

I am using AWS Lambda to check the health status and then send out an email. If the health is down I want it to send an email only once.
This Lambda function runs every 20 minutes or so, and I would like to prevent it from sending multiple emails across intervals while things are broken. Is there a way to store environment variables or something similar in the AWS ecosystem so that the state is known between Lambda function runs? (That way it knows it has already sent an email and doesn't send another.)
I have looked into creating an alarm and sending notifications, but the email sent through an alarm won't do; I would like to send a custom email, so I am using AWS SES through Lambda. There is a CloudWatch alarm that turns on when there is an error, but I can't seem to fetch the state of the alarm through the aws-sdk (it's apparently not there).
I have written the function in NodeJS.
Any suggestions?
I've implemented something like this a little differently. I too do not care for getting an email for each error, since the errors I receive from my AWS Lambdas do not require immediate attention. I prefer to get them once an hour.
So I write all the errors I receive to an SQS queue. I configure the AWS Lambdas that throw the errors to send certain errors (configurable via environment variables) to certain SQS queues. CloudWatch rules, running on whatever schedule I choose, then invoke an AWS Lambda, passing in the SQS queue to read from as part of the rule definition. The Lambda invoked by the CloudWatch rule reads from that SQS queue and emails the results.
For your case you could modify that process to read all the errors from SQS, then filter that data down to the results you want to send. I use SQS because the "errors" I get don't need to be persisted.
I could see two quick ways to store something like a "last_email_sent" value. The first would be in DynamoDB. This is part of the AWS "serverless" environment that doesn't require you to do much more than interact with it. You didn't indicate your development environment but there are multiple development environments that are supported.
The second would be with the SSM Parameter Store. You can store any number of parameters there too.
There are likely other ways to do this too. Both of these are a bit of overkill but they would work to store what you need.
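A minimal sketch of the Parameter Store option follows. It is written in Python for illustration although the question uses NodeJS, and the same `get_parameter` and `put_parameter` operations exist in the JavaScript SDK. Instead of a timestamp it stores the last observed state ("up" or "down") and reports that an email is due only on the transition to "down". The parameter name and the function names are hypothetical, and `ssm` is expected to be a boto3 SSM client (`boto3.client("ssm")`).

```python
PARAM_NAME = "/health-check/last-state"  # hypothetical parameter name


def read_state(ssm, name=PARAM_NAME):
    """Return the stored state ("up" or "down"), or None if never written."""
    try:
        return ssm.get_parameter(Name=name)["Parameter"]["Value"]
    except ssm.exceptions.ParameterNotFound:
        return None


def should_notify(ssm, healthy, name=PARAM_NAME):
    """Record the current state and return True only when the service has
    just gone down, so that one email is sent per outage."""
    previous = read_state(ssm, name)
    current = "up" if healthy else "down"
    if current != previous:
        ssm.put_parameter(Name=name, Value=current, Type="String", Overwrite=True)
    return current == "down" and previous != "down"
```

The scheduled Lambda would call `should_notify(ssm, healthy)` after each health check and send the SES email only when it returns True. The function's execution role would need permission for `ssm:GetParameter` and `ssm:PutParameter` on that parameter.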
Alright, I found a better way that is simpler and avoids dealing with other constraints. The NodeJS SDK is limited as it is. When the service is down, create an alarm through the SDK; the next time the Lambda is triggered, check whether the alarm already exists to decide whether to send an email. That way, if you also want some notification through the alarm, that is possible too.
I think in my question I said this was not possible (last part), which I will retract.
Here is the link for the sdk reference: https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/CloudWatch.html