AWS Lambda instance runs for more than 15 minutes - amazon-web-services

I have a Lambda that runs an API, so every instance of the Lambda starts a Flask server, which runs until the Lambda is killed. When another request is received, a new instance of the Lambda is created, starting the Flask server again.
Looking at the logs from one of my lambdas, a log stream ran for more than 15 minutes (~20 minutes before a process failed). The timeout is set to 5 min 0 seconds.
From what I understand, one long stream is generated per instance, an instance is killed after the timeout has been reached, and no timeout can be set for more than 15 minutes.
How do I have logs that span for ~20 minutes in the same log stream? Is the timeout the amount of time a lambda can run without generating logs before being killed? Is there a way to limit the time a lambda instance runs before the instance is terminated?
Initial log timestamp: 2022-02-20T11:05:05.970-07:00
Final log timestamp: 2022-02-20T11:25:29.895-07:00
Lambda Timeout: 5min0sec
START, END, REPORT statements -- I have multiple of these per log stream:
2022-02-20T11:05:06.509-07:00
START RequestId: 6807306e-f05f-425b-bd75-985b42794eaa
2022-02-20T11:05:07.203-07:00
END RequestId: 6807306e-f05f-425b-bd75-985b42794eaa
2022-02-20T11:05:07.203-07:00
REPORT RequestId: 6807306e-f05f-425b-bd75-985b42794eaa Duration: 691.58 ms Billed Duration: 2856 ms Memory Size: 128 MB Max Memory Used: 119 MB Init Duration: 2163.68 ms
XRAY TraceId: 1-621282cf-00745c0f030b1d810fad9710 SegmentId: 3effe66c64dc69d4 Sampled: true
2022-02-20T11:05:07.767-07:00
START RequestId: 8c724d99-8640-4634-88b9-1efb774c54a9 Version: $LATEST
2022-02-20T11:05:07.845-07:00
END RequestId: 8c724d99-8640-4634-88b9-1efb774c54a9
2022-02-20T11:05:07.845-07:00
REPORT RequestId: 8c724d99-8640-4634-88b9-1efb774c54a9 Duration: 76.09 ms Billed Duration: 77 ms Memory Size: 128 MB Max Memory Used: 119 MB
XRAY TraceId: 1-621282d3-55a51f9d69552d8905b959b7 SegmentId: 47dc3b533278679f Sampled: true
2022-02-20T11:05:08.182-07:00
START RequestId: ee69e296-af07-49a2-8264-ed9758d03992 Version: $LATEST
2022-02-20T11:05:08.263-07:00
END RequestId: ee69e296-af07-49a2-8264-ed9758d03992
2022-02-20T11:05:08.263-07:00
REPORT RequestId: ee69e296-af07-49a2-8264-ed9758d03992 Duration: 78.26 ms Billed Duration: 79 ms Memory Size: 128 MB Max Memory Used: 119 MB
XRAY TraceId: 1-621282d4-7c8f7ea75ead3f7c166a44a6 SegmentId: 2dcaed42621dbebd Sampled: true
... the final start/end that fails.
2022-02-20T11:25:29.749-07:00
START RequestId: 5521e958-0ffb-4de7-b028-953a21ac2ac9 Version: $LATEST
2022-02-20T11:25:29.895-07:00
END RequestId: 5521e958-0ffb-4de7-b028-953a21ac2ac9
2022-02-20T11:25:29.895-07:00
REPORT RequestId: 5521e958-0ffb-4de7-b028-953a21ac2ac9 Duration: 142.60 ms Billed Duration: 143 ms Memory Size: 128 MB Max Memory Used: 119 MB
XRAY TraceId: 1-62128799-19bd91ab324cf0271e964434 SegmentId: 554bd90b4a8ba1c5 Sampled: true

I think your misconception here is "When another request is received, a new instance of the lambda is created, starting the flask server again." This is not necessarily true. According to the documentation:
The first time you invoke your function, AWS Lambda creates an instance of the function and runs its handler method to process the event. When the function returns a response, it stays active and waits to process additional events.
This makes perfect sense, especially for an HTTP API. If a new Lambda instance had to spin up for every request, database connections, caching layers, etc. would all have to be re-established. That would not be scalable or performant. So AWS is doing you a favor by keeping the Lambda instance up and running, even though you only pay for the time it is actually working.
As for the timeout question: the timeout is the maximum time the Lambda has to complete a single invocation, in your case responding to one API request. It is not how long a Lambda instance will stay active.
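To make the reuse concrete, here is a minimal sketch (a hypothetical Python handler, not the asker's code): state defined at module scope is created once per execution environment and is then shared by every invocation that instance serves, which is why a single log stream can span far longer than the per-invocation timeout.

import time

# Module-scope state: initialised once per execution environment (cold start),
# then reused by every invocation this instance handles.
INSTANCE_STARTED = time.time()
invocation_count = 0

def handler(event, context):
    global invocation_count
    invocation_count += 1
    # Every invocation served by this instance writes to the same log stream
    # and reports the same INSTANCE_STARTED with an increasing counter.
    print(f"instance age: {time.time() - INSTANCE_STARTED:.0f}s, invocation #{invocation_count}")
    return {"statusCode": 200, "body": "ok"}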

A single request will be terminated after 15 minutes at most, but the instance that handled the request stays around for roughly 4.5 hours (last time I checked, at least),
and it writes every request to the same log stream.
If another request comes in while the Lambda instance is alive and not busy, the same instance is reused and writes to the same log stream as the previous call.
Only if no Lambda instance is free to work on the request will a new instance be started.
This is why "warm up" calls work for poorly designed Lambdas with long cold-start times.
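A sketch of what such a warm-up call can look like (hedged: the "warmup" key is just an assumed convention between a scheduled rule and the handler, not an AWS feature):

def handler(event, context):
    # A scheduled rule (e.g. EventBridge) invokes the function with a marker
    # payload; returning early keeps the instance warm without running the
    # expensive path.
    if isinstance(event, dict) and event.get("warmup"):
        return {"statusCode": 200, "body": "warm"}

    # ... normal request handling ...
    return {"statusCode": 200, "body": "ok"}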

Related

How to increase timeout of Amplify Nextjs SSR function from 10s to 30s?

I'm using a Next.js project on AWS Amplify. On my local environment, one of my API routes averages around 18-20 seconds to finish, as it makes multiple (6 total) third-party API requests to services outside of AWS. However, when deploying to Amplify, the API route times out at 10 seconds.
I've tried manually setting the Amplify Lambda function's timeout to 30 seconds; this works, BUT it gets overwritten back to 10 seconds after every deployment (see screenshot below).
CloudWatch log:
...
END RequestId: 0f5be97f-9519-4d07-8c51-45ec9968171f
REPORT RequestId: 0f5be97f-9519-4d07-8c51-45ec9968171f Duration: 10010.52 ms Billed Duration: 10000 ms Memory Size: 512 MB Max Memory Used: 159 MB
2021-08-27T16:22:55.214Z 0f5be97f-9519-4d07-8c51-45ec9968171f Task timed out after 10.01 seconds
It looks like they (AWS team) have fixed it!
Now, it keeps track of the timeout you had previously saved \o/
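For reference, the function timeout can also be raised outside the console; a hedged boto3 sketch (the function name is a placeholder, and before the fix this setting would have been overwritten on deploy just like the console value):

import boto3

lambda_client = boto3.client("lambda")

# Timeout is given in seconds and is capped at 900 (15 minutes).
# "my-amplify-ssr-function" is a placeholder name.
lambda_client.update_function_configuration(
    FunctionName="my-amplify-ssr-function",
    Timeout=30,
)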

AWS Lambda - Timeout

I have a simple Lambda function which calculates a result asynchronously. I can log the result and it seems to be correct, but for some reason the Lambda function doesn't return successfully; instead I get a timeout. If you look at the timestamps you can see that the result is calculated well before the timeout. The weird thing is that it works fine when I am using axios, but whenever I use Fauna it doesn't work anymore, even though it logs the correct result... I have been sitting on this problem for days and have no clue what to do. I am using the Serverless Framework along with this template.
Response
{
"errorMessage": "2021-03-10T07:11:11.567Z 0180b87e-e01f-4527-8c7e-4c1dd5e3e354 Task timed out after 6.01 seconds"
}
Function Logs
START RequestId: 0180b87e-e01f-4527-8c7e-4c1dd5e3e354 Version: $LATEST
2021-03-10T07:11:05.811Z 0180b87e-e01f-4527-8c7e-4c1dd5e3e354 INFO Sending response: { statusCode: 200, body: '{"result":100}' }
END RequestId: 0180b87e-e01f-4527-8c7e-4c1dd5e3e354
REPORT RequestId: 0180b87e-e01f-4527-8c7e-4c1dd5e3e354 Duration: 6007.06 ms Billed Duration: 6000 ms Memory Size: 256 MB Max Memory Used: 76 MB Init Duration: 205.66 ms
2021-03-10T07:11:11.567Z 0180b87e-e01f-4527-8c7e-4c1dd5e3e354 Task timed out after 6.01 seconds
Any help would be much appreciated!
Found the issue. Within the handler I set context.callbackWaitsForEmptyEventLoop = false. Alternatively, when using middy you can use this middleware.

AWS Lambda constantly receiving empty SQS event messages

I'm a relative n00b at AWS, so apologise if this is a stupid question.
I have an AWS Lambda written in Java. I also have an SQS queue that receives AWS S3 event messages. I've then created a Lambda trigger against the SQS queue so that my Lambda receives the S3 events as SQS messages and processes them appropriately.
It all works well. The only issue I have is that it seems like the Lambda is receiving notification of an SQS event message every 2 minutes, even when there are no messages in the SQS queue.
The Java code looks like this:
public class SQSEventHandler implements RequestHandler<SQSEvent, Void> {
    private static final Logger LOGGER = LoggerFactory.getLogger(SQSEventHandler.class);

    @Override
    public Void handleRequest(SQSEvent sqsEvent, Context context) {
        if (sqsEvent != null) {
            LOGGER.debug("Received SQS event: {}", sqsEvent.toString());
            // ... do stuff ...
        }
        return null;
    }
}
If I look in the CloudWatch logs (I log using SLF4J), I can see that the Lambda is triggered with different SQS messages every 2 minutes, even during periods when there are no S3 event messages to process:
02:54:16 START RequestId: d5454080-8ea3-4c44-93e9-caa5bd903599 Version: $LATEST
02:54:16 [2020-02-13 02:54:16.220] - [d5454080-8ea3-4c44-93e9-caa5bd903599] DEBUG <package>.SQSEventHandler - Received SQS event: {}
02:54:16 END RequestId: d5454080-8ea3-4c44-93e9-caa5bd903599
02:54:16 REPORT RequestId: d5454080-8ea3-4c44-93e9-caa5bd903599 Duration: 1.05 ms Billed Duration: 100 ms Memory Size: 512 MB Max Memory Used: 161 MB
02:56:16 START RequestId: 9d5acbba-b96c-47e9-81c2-2d448e4ca6e9 Version: $LATEST
02:56:16 [2020-02-13 02:56:16.386] - [9d5acbba-b96c-47e9-81c2-2d448e4ca6e9] DEBUG <package>.SQSEventHandler - Received SQS event: {}
02:56:16 END RequestId: 9d5acbba-b96c-47e9-81c2-2d448e4ca6e9
02:56:16 REPORT RequestId: 9d5acbba-b96c-47e9-81c2-2d448e4ca6e9 Duration: 1.23 ms Billed Duration: 100 ms Memory Size: 512 MB Max Memory Used: 161 MB
02:58:16 START RequestId: 54bc4fa4-bcaf-4834-9185-09c9c7e2d757 Version: $LATEST
02:58:16 [2020-02-13 02:58:16.451] - [54bc4fa4-bcaf-4834-9185-09c9c7e2d757] DEBUG <package>.SQSEventHandler - Received SQS event: {}
02:58:16 END RequestId: 54bc4fa4-bcaf-4834-9185-09c9c7e2d757
02:58:16 REPORT RequestId: 54bc4fa4-bcaf-4834-9185-09c9c7e2d757 Duration: 1.01 ms Billed Duration: 100 ms Memory Size: 512 MB Max Memory Used: 161 MB
There are no other SQS queues with triggers to this Lambda.
As you can see, the SQS event object isn't null but doesn't produce anything in the toString() call.
I can't work out what the issue is - any assistance would be appreciated.
Unbeknownst to me, there was a CloudWatch rule set up to send a message to my Lambda every two minutes. Once this had been located, I disabled the rule and the Lambda was no longer triggered.
What did your Lambda function do with the SQS messages?
If it processed them to completion, then you must delete them from the SQS queue, otherwise they will reappear after the visibility timeout expires. This is by design: it is how SQS deals with applications that receive messages and then die before they can finish processing them.
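If you are receiving the messages yourself (outside of a Lambda SQS trigger, which deletes successfully processed messages on your behalf), the acknowledgement is a DeleteMessage call; a boto3 sketch with placeholder values:

import boto3

sqs = boto3.client("sqs")

def acknowledge(queue_url: str, receipt_handle: str) -> None:
    # Deleting by receipt handle is what stops the message from reappearing
    # once its visibility timeout expires.
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=receipt_handle)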

Only logging errors with AWS Lambda

I'm trying to add a few lines of code so that, when my AWS Lambda function fails, it logs when it failed and with which input parameters. Following the documentation, I added these lines:
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)
logger.info('user {0}'.format(event["user"]))
They generate some information that are accessible from CloudWatch:
08:50:29 - START RequestId: 92d000ad-b01f-11e8-98a6-c32aa1e3e890 Version: $LATEST
08:50:31 - [INFO] 2018-09-04T08:50:31.781Z 92d000ad-b01f-11e8-98a6-c32aa1e3e890 user xxxxxx
08:50:31 - END RequestId: 92d000ad-b01f-11e8-98a6-c32aa1e3e890
08:50:31 - REPORT RequestId: 92d000ad-b01f-11e8-98a6-c32aa1e3e890 Duration: 2513.04 ms Billed Duration: 2600 ms Memory Size: 896 MB Max Memory Used: 37 MB
However, it seems that every single call of the Lambda function creates a log entry in CloudWatch. As it is, it's impossible to identify the logs associated with failures of the function. Is it possible to create log entries only when the function actually fails? Alternatively, can an S3 bucket be set up to store log files (associated with errors)?
Any reason you are not utilizing the log level?
logger.setLevel(logging.ERROR)
If you need to log all events, and CloudWatch would be a good place to do that, then you could consider creating a metric filter in CloudWatch Logs to create alerts for all entries containing the keyword 'error', for example.
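To get an entry (including the inputs) only when the function actually fails, one option is a sketch along these lines, assuming the handler layout from the question (do_work is a placeholder for the real logic):

import logging

logger = logging.getLogger()
logger.setLevel(logging.ERROR)  # INFO and DEBUG records are now dropped

def lambda_handler(event, context):
    try:
        return do_work(event)  # placeholder for the function's real logic
    except Exception:
        # Only failures produce a log entry, and it carries the input parameters.
        logger.exception("failed for user %s with event %s", event.get("user"), event)
        raise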

Lambda times out after ending

After finishing successfully, a Lambda function insists on timing out.
The function's triggering event is s3:ObjectCreated:*.
The function uses MongoDB Atlas and does so according to the optimisation suggestions on https://www.mongodb.com/blog/post/optimizing-aws-lambda-performance-with-mongodb-atlas-and-nodejs, including setting:
context.callbackWaitsForEmptyEventLoop = false; before using the DB.
The function also calls some AWS SDK methods with successfully resolved promises.
After my code finishes successfully and does everything it set out to do, I get the following in my CloudWatch logs (both the request's END event and its timeout):
START RequestId: XXX
... my logs...
END RequestId: XXX
REPORT RequestId: XXX Duration: 6001.12 ms Billed Duration: 6000 ms Memory Size: 1024 MB Max Memory Used: 49 MB
XXX Task timed out after 6.00 seconds
The function then repeats itself twice more with the same unfortunate result.
Any immediate suspects? Where should I look?
You need to call callback(null, <any>) in order to end your function handler and tell Lambda that your function executed successfully.
Without it, Lambda will retry the same invocation after a delay, and the function will again finish without ever telling Lambda that it succeeded.