Only logging errors with AWS Lambda - amazon-web-services

I'm trying to add a few lines of code so that, when my AWS Lambda function fails, it logs when it failed and with which input parameters. Following the documentation, I added these lines:
logger = logging.getLogger()
logger.setLevel(logging.INFO)
logger.info('user {0}'.format(event["user"]))
They generate some information that is accessible from CloudWatch:
08:50:29 - START RequestId: 92d000ad-b01f-11e8-98a6-c32aa1e3e890 Version: $LATEST
08:50:31 - [INFO] 2018-09-04T08:50:31.781Z 92d000ad-b01f-11e8-98a6-c32aa1e3e890 user xxxxxx
08:50:31 - END RequestId: 92d000ad-b01f-11e8-98a6-c32aa1e3e890
08:50:31 - REPORT RequestId: 92d000ad-b01f-11e8-98a6-c32aa1e3e890 Duration: 2513.04 ms Billed Duration: 2600 ms Memory Size: 896 MB Max Memory Used: 37 MB
However, it seems that every single call of the Lambda function creates a log entry in CloudWatch. As it is, it's impossible to identify the logs associated with failures of the function. Is it possible instead to create log entries only when the logger actually writes something? Alternatively, can an S3 bucket be set up to store log files (associated with errors)?

Any reason you are not utilizing the log level?
logger.setLevel(logging.ERROR)
If you need to log all events, and CloudWatch would be a good place to do that, then you could consider creating a metric filter in CloudWatch Logs to raise alerts for all entries containing a keyword such as 'error'.
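As a rough illustration of the log-level suggestion, the handler below only writes a log record when the invocation fails, and includes the input event in that record. It is a minimal sketch: `process` is a hypothetical stand-in for the real work of the function, and the message format is an assumption rather than anything from the question.

```python
import logging

logger = logging.getLogger()
logger.setLevel(logging.ERROR)  # INFO and DEBUG records are dropped


def process(event):
    """Hypothetical stand-in for the real work of the function."""
    return {"user": event["user"]}


def lambda_handler(event, context):
    try:
        return process(event)
    except Exception:
        # logger.exception logs at ERROR level and appends the traceback.
        logger.exception("Invocation failed for input: %r", event)
        raise  # re-raise so Lambda still records the invocation as failed
```

Note that the START, END and REPORT lines are written by the Lambda service itself, so they still appear for every invocation regardless of the logger's level.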


How to follow each request in Lambda logs

AWS Lambda writes its logs to log streams.
I exported the logs to S3 and downloaded them.
However, in those files the timestamps are not sorted in order.
And sometimes the lines of the same request are separated.
Normally:
2022-08-08T04:08:17.808Z START RequestId: 6ec642a2-84d6-4e09-8155-ad8c0cbefd52 Version: $LATEST
...
...
...
2022-08-08T04:08:18.688Z END RequestId: 6ec642a2-84d6-4e09-8155-ad8c0cbefd52
2022-08-08T04:08:18.688Z REPORT RequestId: 6ec642a2-84d6-4e09-8155-ad8c0cbefd52 Duration: 878.21 ms Billed Duration: 879 ms Memory Size: 128 MB Max Memory Used: 114 MB
Sometimes it is separated.
2022-08-08T04:08:17.808Z START RequestId: 6ec642a2-84d6-4e09-8155-ad8c0cbefd52 Version: $LATEST
...
...
...
(lines from other RequestIds appear here)
(other logs are listed here)
...
2022-08-08T04:08:18.688Z END RequestId: 6ec642a2-84d6-4e09-8155-ad8c0cbefd52
2022-08-08T04:08:18.688Z REPORT RequestId: 6ec642a2-84d6-4e09-8155-ad8c0cbefd52 Duration: 878.21 ms Billed Duration: 879 ms Memory Size: 128 MB Max Memory Used: 114 MB
It is quite confusing and difficult to follow the log of each request.
I am thinking of writing a script to parse the log and group the lines by RequestId.
However, I am afraid that would be reinventing the wheel.
Is there a practical way to parse Lambda logs?
The CloudWatch Logs Insights console can be used to filter the requests by ID. If you have your request ID, you can select your Lambda's log group and write a short query (just be sure to set the correct time frame in the top right).
Something like this would work:
fields @timestamp, @message
| sort @timestamp asc
| filter @requestId = "6ec642a2-84d6-4e09-8155-ad8c0cbefd52"
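For logs that have already been exported and downloaded, the grouping the question describes can be sketched in a few lines of Python. This is only an illustration under two assumptions: every line starts with an ISO 8601 UTC timestamp as in the samples above, and every line of interest contains its request ID somewhere in the text. Lines without a request ID are collected under the key `None`. The function name `group_by_request` is made up for this example.

```python
import re
from collections import defaultdict

# Lambda request IDs are UUIDs; lines without one are kept under None.
REQUEST_ID = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def group_by_request(lines):
    """Return {request_id: [lines ordered by their leading timestamp]}."""
    groups = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        match = REQUEST_ID.search(line)
        groups[match.group(0) if match else None].append(line)
    # ISO 8601 UTC timestamps compare correctly as plain strings, and
    # sorted() is stable, so lines with equal timestamps keep their order.
    return {
        rid: sorted(batch, key=lambda text: text.split(" ", 1)[0])
        for rid, batch in groups.items()
    }
```

Calling `group_by_request(open("export.log"))` (with `export.log` standing for whatever file was downloaded) would then give one ordered list of lines per request.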

AWS Lambda instance runs for more than 15 minutes

I have a Lambda that is running an API, so every instance of the Lambda starts a Flask server, which runs until the Lambda is killed. When another request is received, a new instance of the Lambda is created, starting the Flask server again.
Looking at the logs from one of my Lambdas, a log stream ran for more than 15 minutes (~20 minutes before a process failed). The timeout is set to 5 min 0 seconds.
From what I understand, one log stream is generated per instance, an instance is killed after the timeout has been reached, and no timeout can be set for more than 15 minutes.
How can I have logs that span ~20 minutes in the same log stream? Is the timeout the amount of time a Lambda can run without generating logs before being killed? Is there a way to limit the time a Lambda instance runs before the instance is terminated?
Initial log timestamp: 2022-02-20T11:05:05.970-07:00
Final log timestamp: 2022-02-20T11:25:29.895-07:00
Lambda Timeout: 5min0sec
START, END, REPORT statements -- I have multiple of these per log stream:
2022-02-20T11:05:06.509-07:00
START RequestId: 6807306e-f05f-425b-bd75-985b42794eaa
2022-02-20T11:05:07.203-07:00
END RequestId: 6807306e-f05f-425b-bd75-985b42794eaa
2022-02-20T11:05:07.203-07:00
REPORT RequestId: 6807306e-f05f-425b-bd75-985b42794eaa Duration: 691.58 ms Billed Duration: 2856 ms Memory Size: 128 MB Max Memory Used: 119 MB Init Duration: 2163.68 ms
XRAY TraceId: 1-621282cf-00745c0f030b1d810fad9710 SegmentId: 3effe66c64dc69d4 Sampled: true
2022-02-20T11:05:07.767-07:00
START RequestId: 8c724d99-8640-4634-88b9-1efb774c54a9 Version: $LATEST
2022-02-20T11:05:07.845-07:00
END RequestId: 8c724d99-8640-4634-88b9-1efb774c54a9
2022-02-20T11:05:07.845-07:00
REPORT RequestId: 8c724d99-8640-4634-88b9-1efb774c54a9 Duration: 76.09 ms Billed Duration: 77 ms Memory Size: 128 MB Max Memory Used: 119 MB
XRAY TraceId: 1-621282d3-55a51f9d69552d8905b959b7 SegmentId: 47dc3b533278679f Sampled: true
2022-02-20T11:05:08.182-07:00
START RequestId: ee69e296-af07-49a2-8264-ed9758d03992 Version: $LATEST
2022-02-20T11:05:08.263-07:00
END RequestId: ee69e296-af07-49a2-8264-ed9758d03992
2022-02-20T11:05:08.263-07:00
REPORT RequestId: ee69e296-af07-49a2-8264-ed9758d03992 Duration: 78.26 ms Billed Duration: 79 ms Memory Size: 128 MB Max Memory Used: 119 MB
XRAY TraceId: 1-621282d4-7c8f7ea75ead3f7c166a44a6 SegmentId: 2dcaed42621dbebd Sampled: true
... the final start/end that fails.
2022-02-20T11:25:29.749-07:00
START RequestId: 5521e958-0ffb-4de7-b028-953a21ac2ac9 Version: $LATEST
2022-02-20T11:25:29.895-07:00
END RequestId: 5521e958-0ffb-4de7-b028-953a21ac2ac9
2022-02-20T11:25:29.895-07:00
REPORT RequestId: 5521e958-0ffb-4de7-b028-953a21ac2ac9 Duration: 142.60 ms Billed Duration: 143 ms Memory Size: 128 MB Max Memory Used: 119 MB
XRAY TraceId: 1-62128799-19bd91ab324cf0271e964434 SegmentId: 554bd90b4a8ba1c5 Sampled: true
I think your misconception here is "When another request is received, a new instance of the lambda is created- starting the flask server again." This is not necessarily true. According to the documentation:
The first time you invoke your function, AWS Lambda creates an instance of the function and runs its handler method to process the event. When the function returns a response, it stays active and waits to process additional events.
This makes perfect sense, especially for an HTTP API. If a new Lambda instance had to spin up for every request, database connections, caching layers, etc. would all have to be reestablished. This would not be scalable or performant. So AWS is doing you a favor by keeping the Lambda instance up and running, even though you only pay for the time it's actually working.
As for the timeout question, the timeout is the maximum time the Lambda is allowed to spend on a single job, in your case responding to one API request. It is not how long a Lambda instance will stay active.
A single request will be terminated after at most 15 minutes, but the instance the request ran on stays around for about 4.5 hours (last time I checked, at least), and it writes every request to the same log stream.
If another request comes in while the Lambda instance is "alive" and not busy, the same instance is reused and writes to the same log stream as the previous call.
Only if no Lambda instance is free to work on the request is a new instance started.
This is why "warm-up" calls work for some badly designed Lambdas with long start times.
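Instance reuse can be observed directly from inside a function. The following is a minimal sketch, not taken from the question: module-level code runs once when an instance is created, so values set there survive across invocations served by that instance. The names `INSTANCE_ID`, `STARTED_AT` and `invocation_count` are invented for the illustration.

```python
import time
import uuid

# Module-level code runs once per instance (the cold start), not per request.
INSTANCE_ID = uuid.uuid4().hex  # hypothetical label for this instance
STARTED_AT = time.time()
invocation_count = 0


def lambda_handler(event, context):
    global invocation_count
    invocation_count += 1
    return {
        "instance_id": INSTANCE_ID,
        "invocation": invocation_count,
        "instance_age_seconds": round(time.time() - STARTED_AT, 3),
    }
```

If two responses carry the same `instance_id`, they were served by the same instance and their log lines land in the same log stream, however far apart the requests were.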

DynamoDB Trigger Lambda Function PROBLEM: Function call failed

I enabled streams on my DynamoDB table. As items are modified, a Lambda function is triggered. I think I set up everything correctly on the Lambda trigger side, the permissions, and the DynamoDB side. I also ran my Lambda function with test data and it succeeded. However, when items are modified in the table, the trigger did not start my Lambda function. Instead, I got the following error:
Batch size: 100 Last processing result: PROBLEM: Function call failed
Any idea what's the best way to debug this? I went on CloudWatch logs but there were no logs associated with the trigger/stream.
Thanks.
Edit: Logs for the lambda function (not its dynamodb trigger). The trigger didn't generate any log statements.
START RequestId: 3a08eedc-f0de-11e8-9008-033b48d2cb67 Version: $LATEST

18:16:28
END RequestId: 3a08eedc-f0de-11e8-9008-033b48d2cb67

18:16:28
REPORT RequestId: 3a08eedc-f0de-11e8-9008-033b48d2cb67 Duration: 81.85 ms Billed Duration: 100 ms Memory Size: 128 MB Max Memory Used: 30 MB
I ran into this issue today.
I debugged it by manually triggering the lambda with the Test button on the top of the main lambda page. It showed the error output trying to run my lambda.
The reason I had an error was the handler parameter: I had a non-standard JavaScript function name and forgot to configure it in my Lambda.
In my case, my Lambda role did not have permission to write to SNS, and the Lambda code was publishing to an SNS topic. So I added a policy to the Lambda role giving it permission to write to any SNS topic.
In my case, the problem came from the stream batch size of 100. In the Lambda code I was checking the event and exiting if the event didn't meet the requirement.
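One way to read the batch-size answer is that a handler receiving up to 100 records should check each record on its own instead of leaving as soon as one does not match. The sketch below illustrates that reading in Python. The filter on `eventName == "MODIFY"` is a hypothetical requirement chosen for the example; the `Records`, `eventName` and `dynamodb.Keys` fields follow the shape of DynamoDB Streams events.

```python
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)


def lambda_handler(event, context):
    processed = 0
    for record in event.get("Records", []):
        # Hypothetical requirement: only react to modified items.
        if record.get("eventName") != "MODIFY":
            continue  # skip this record, but keep going through the batch
        keys = record["dynamodb"]["Keys"]
        logger.info("Processing modified item with keys %s", keys)
        processed += 1
    return {"processed": processed}
```

Logging one line per handled record also leaves a trace in the function's log stream, which helps when the trigger itself reports nothing more than "Function call failed".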
In my case, the handler method was not configured correctly for the Lambda function written in Java. The format that I used for setting the handler is as follows
packageName.className::handlerMethod
For example, for my handler class
package com.example;
public class App implements RequestHandler<DynamodbEvent, String> {
    public String handleRequest(DynamodbEvent ddbEvent, Context context) {
the handler should be defined as
com.example.App::handleRequest
This sounds like a possible use-case for Rookout if you need to follow variable values in your live Lambda in a situation where you're not able to generate logs and running it locally isn't going to give you real-world event trigger data.

Lambda times out after ending

After finishing successfully, a Lambda function insists on timing out.
The function's triggering event is s3:ObjectCreated:*.
The function uses MongoDB Atlas and does so according to the optimisation suggestions on https://www.mongodb.com/blog/post/optimizing-aws-lambda-performance-with-mongodb-atlas-and-nodejs, including setting:
context.callbackWaitsForEmptyEventLoop = false; before using the DB.
The function also calls some AWS SDK methods with successfully resolved promises.
After finishing my code successfully and doing everything it set out to do, I get the following in my CloudWatch logs, (both the request's END event and its timeout):
START RequestId: XXX
... my logs...
END RequestId: XXX
REPORT RequestId: XXX Duration: 6001.12 ms Billed Duration: 6000 ms Memory Size: 1024 MB Max Memory Used: 49 MB
XXX Task timed out after 6.00 seconds
The function then repeats itself twice more with the same unfortunate result.
Any immediate suspects? Where should I look?
You need to call callback(null, <any>) in order to end your function handler and tell Lambda that your function executed successfully.
Without that, Lambda will retry the same invocation after a delay and it will again finish but without telling Lambda that it finished successfully.

AWS Lambda: Identifying cold starts

Is there a clear way to identify "cold starts"? Either at runtime in the Lambda itself, or via the logs? I know that cold starts are characterized by longer runtimes, which I can actually see, but I'm looking for a clear-cut way.
I'm using Node.js if that matters.
Update: There are two good answers below, for two use cases:
- Identifying the cold start as the lambda runs.
- Identifying the cold start from the CloudWatch log.
If you add some initialization code to the top of your NodeJS script, you will be able to tell in the code that it is a cold start, and you will then be able to log that if you want to see it in the logs. For example:
var coldStart = true;
console.log("This line of code exists outside the handler, and only executes on a cold start");
exports.myHandler = function(event, context, callback) {
    if (coldStart) {
        console.log("First time the handler was called since this function was deployed in this container");
    }
    coldStart = false;
    ...
    callback(...);
}
Update:
If you only care about seeing cold starts in the logs, Lambda now logs an extra "Init Duration" value in CloudWatch Logs for cold starts.
As an update, AWS now provides visible information on cold starts in the form of "Init Duration", inside the REPORT line of a CloudWatch log. Calls that do not suffer from a cold start will not contain this field in the log:
Duration: 1866.19 ms Billed Duration: 1900 ms Memory Size: 512 MB Max Memory Used: 163 MB Init Duration: 2172.14 ms
If you're looking at CloudWatch logs, each log stream in your Lambda function's log group represents a separate container, and therefore the first invocation in that log stream is your cold start.
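The "Init Duration" field mentioned in the answers above can also be picked out of REPORT lines programmatically, for example when scanning exported logs. A small sketch, assuming the REPORT line format shown in this thread; the function name `init_duration_ms` is made up for the example.

```python
import re

INIT_DURATION = re.compile(r"Init Duration: ([\d.]+) ms")


def init_duration_ms(report_line):
    """Return the Init Duration in ms, or None if the line has none."""
    match = INIT_DURATION.search(report_line)
    return float(match.group(1)) if match else None
```

A REPORT line for which this returns a number belongs to a cold start; a return value of `None` means the invocation ran on an already initialized instance.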