Idempotent AWS lambda does not execute code on cold start - amazon-web-services

Problem
When I add the idempotency configuration of aws-lambda-powertools, my function code is not executed properly.
The AWS Lambda serves as a message handler for an MS Teams chatbot. When the function performs a cold start, the async code within the handler is not executed and no message is returned to the user. I also don't see any logs, so it seems that the code in the async handler is not executed at all.
Could this be due to the way I run my async handler?
Code
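import asyncio
import json
from aws_lambda_powertools.utilities.idempotency import idempotent

# persistence_layer, cfg and lambda_messages are defined elsewhere in the module
@idempotent(persistence_store=persistence_layer, config=cfg)
def lambda_handler(event: dict, context: dict):
    asyncio.get_event_loop().run_until_complete(lambda_messages(event))
    payload = json.loads(event["body"])
    return {"status": 400, "payload": payload}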

The issue was due to the timeout of my AWS SAM function not being configured properly. Because of aws-lambda-powertools, it was hard to debug, as the error was not easily visible.
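To make this kind of timeout visible instead of silent, one option is to give the async work an explicit deadline derived from the remaining Lambda time. The following is a minimal sketch, not the asker's actual fix: `lambda_messages` here is a hypothetical stand-in for the real message handler, and the 0.5 second safety margin and the 504 response are assumptions.

```python
import asyncio
import json
import logging

logger = logging.getLogger(__name__)


async def lambda_messages(event: dict) -> None:
    """Hypothetical stand-in for the question's async message handler."""
    await asyncio.sleep(event.get("delay", 0))


def lambda_handler(event: dict, context) -> dict:
    # Reserve a small margin (assumed: 0.5 s) before the configured timeout.
    budget = max(context.get_remaining_time_in_millis() / 1000 - 0.5, 0.1)
    try:
        asyncio.run(asyncio.wait_for(lambda_messages(event), timeout=budget))
    except asyncio.TimeoutError:
        # Log explicitly so the timeout shows up in CloudWatch.
        logger.error("lambda_messages exceeded its %.1fs budget", budget)
        return {"statusCode": 504, "body": json.dumps({"error": "timed out"})}
    return {"statusCode": 200, "body": event.get("body", "{}")}
```

With this shape, a too-short function timeout produces a log line and an error response rather than an invocation that simply disappears.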

Related

Lambda function triggered by ALB but no response

I am using an ALB as a trigger for a Lambda function. When I post a request to the ALB, I can see the trigger request in CloudWatch; however, there is no response to the POST request I sent using Postman.
I added logs in the code to check whether it enters the Lambda or not, but I can't see the Python logs. I also added a role with ElasticLoadBalancingFullAccess to the Lambda, but still no response. I am not sure how to debug this or move further. I tried multiple things: I added context.done(response) to the Lambda handler, and I also changed the return format to JSON containing the status code and the body. Any insights will be appreciated.
EDIT:
details about ALB:
listeners: port 80
target: lambda type and I choose my lambda function
security group: simple security groups allow public access (it works fine as I am triggering the lambda by the request)
lambda code:
def lambda_handler(event, context):
    # Initialize your log configuration using the base class
    context.succeed({
        'statusCode': 200,
        'body': json.dumps("wuccedd")
    })
I also noticed the same behaviour when I intentionally put an error in the Lambda function, for example:
def lambda_handler(event, context):
    # error
    x= 10/0,,,,,
    context.succeed({
        'statusCode': 200,
        'body': json.dumps("wuccedd")
    })
This also kept the request stuck, which suggests that execution never enters the handler function. Any idea why the invocation shows up in CloudWatch but the handler function isn't entered?
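For comparison, here is a minimal sketch of a Python handler behind an ALB, offered as one thing to check rather than a confirmed diagnosis. The Python context object has no `succeed` method (that call comes from the old Node.js programming model), so a Python handler returns its response instead. The field values below are illustrative assumptions.

```python
import json


def lambda_handler(event, context):
    # An ALB Lambda target expects the response as the handler's return value.
    return {
        "statusCode": 200,
        "statusDescription": "200 OK",
        "isBase64Encoded": False,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps("succeeded"),
    }
```

If the module itself fails to import (for example because of a syntax error), the handler is never entered, which would match the second experiment above.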

Boto3 invocations of long-running Lambda runs break with TooManyRequestsException

Experience with "long-running" Lambdas
In my company, we recently ran into this behaviour when triggering Lambdas that run for more than 60 seconds (boto3's default timeout for connection establishment and reads).
The beauty of invoking a Lambda with boto3 using the 'InvocationType' 'RequestResponse' is that the API returns the result state of the respective Lambda run, so we wanted to stick to that.
The issue seems to be that the client fires too many requests per minute on the standing connection to the API. We therefore experimented with the boto3 client configuration, but increasing the read timeout resulted in new (unwanted) invocations after each timeout period, and increasing the connection timeout triggered a new invocation after the Lambda had finished.
Workaround
As various investigations and experiments with boto3's Lambda client did not result in a working setup using 'RequestResponse' invocations, we circumvented the problem by making use of CloudWatch Logs. For this, the Lambda has to be set up to write to an accessible log group. These logs can then be queried for the state. You would invoke the Lambda and monitor it like this:
import time
import boto3

lambda_client = boto3.client('lambda')
logs_client = boto3.client('logs')

invocation = lambda_client.invoke(
    FunctionName='your_lambda',
    InvocationType='Event'
)
# Identifier of the invoked Lambda run
request_id = invocation['ResponseMetadata']['RequestId']
while True:
    # Filter the logs for the Lambda end event
    events = logs_client.filter_log_events(
        logGroupName='your_lambda_loggroup',
        filterPattern=f'"END RequestId: {request_id}"'
    ).get('events', [])
    if len(events) > 0:
        # The Lambda invocation finished
        break
    # Pause between polls to avoid hammering the CloudWatch Logs API
    time.sleep(5)
This approach works for us now, but it's honestly ugly. To make it slightly better, I recommend setting the time range filters in the filter_log_events call.
One thing that has not been tested (yet): the above approach only tells whether the Lambda terminated, not its state (failed or successful), and the default logs don't hold anything useful in that regard. Therefore, I will investigate whether a Lambda run can know its own request ID at runtime. The Lambda code could then be prepared to also write error messages with the request ID, which can be filtered for again.
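On the open point above: in the Python runtime the context object exposes the request ID as `context.aws_request_id`. The following is a minimal sketch of how a handler could log its own outcome together with that ID; `do_work` and the `RESULT RequestId:` log format are hypothetical names chosen for illustration.

```python
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)


def do_work(event):
    """Hypothetical business logic; fails when the event asks it to."""
    if event.get("fail"):
        raise ValueError("simulated failure")
    return {"ok": True}


def lambda_handler(event, context):
    request_id = context.aws_request_id
    try:
        result = do_work(event)
    except Exception as exc:
        # Failure line that a log filter can match on the request ID.
        logger.error("RESULT RequestId: %s status=FAILED error=%s", request_id, exc)
        raise
    logger.info("RESULT RequestId: %s status=SUCCEEDED", request_id)
    return result
```

The polling loop could then filter for the `RESULT RequestId: ...` line instead of (or in addition to) the `END RequestId: ...` line.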

Attach additional info to Lambda time-out message?

When a Lambda times out, it outputs a message to CloudWatch (if enabled) saying "Task timed out".
It would be beneficial to attach additional info (such as the context of the offending call) to the message. Right now I'm writing the context to CloudWatch at the start of the invocation - but it would sometimes be preferable if everything was contained within a single message.
Is something like that possible?
Unfortunately, there is no almost-timed-out hook. You may, however, be able to inspect the context object you get in the Lambda handler to look at the remaining run time and, if it gets close to timing out, print the additional info.
In Python you could use context.get_remaining_time_in_millis(), as per the documentation, to get that info.
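A minimal sketch of that idea, assuming the work can be split into items and that a 2 second safety margin is enough; `process_items`, `handle` and `SAFETY_MARGIN_MS` are hypothetical names:

```python
import logging

logger = logging.getLogger(__name__)

SAFETY_MARGIN_MS = 2000  # assumed margin; tune to the workload


def process_items(items, context, handle=lambda item: item):
    """Process items until the invocation gets close to its timeout."""
    done = []
    for index, item in enumerate(items):
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            # Write the extra context ourselves, before Lambda kills the run.
            logger.warning(
                "About to time out at item %d of %d: %r", index, len(items), item
            )
            break
        done.append(handle(item))
    return done
```

This only helps when the handler regains control regularly; a single blocking call that overruns the timeout will not reach the check.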
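There is no timeout hook for Lambda, but one can be implemented with a little bit of code:
import signal

def timeout_handler(_signal, _frame):
    raise Exception('other information')

# Register the handler that fires when the alarm goes off
signal.signal(signal.SIGALRM, timeout_handler)

def handler(event, context):
    ...
    # signal.alarm takes whole seconds; fire one second before the timeout
    signal.alarm(int(context.get_remaining_time_in_millis() / 1000) - 1)
    ...
We implemented something like this for a lot of custom handlers in CloudFormation.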

"LAMBDA_RUNTIME" Error on high-volume Lambda Function

I'm currently using a Lambda function written in JavaScript that is set up with an SQS event source to automatically pull messages from an SQS queue and do some basic processing on the message contents. I cannot show the code, but the summary of the Lambda function's execution is basically:
For each message in the batch it receives as part of the event:
It parses the body, which is a JSON string, into a Javascript object.
It reads an object from S3 that is listed in the object using getObject.
It puts a record into a DynamoDB table using put.
If there were no errors, it deletes the individual SQS message that was processed from the Queue using deleteMessage.
This SQS queue is high-volume and receives messages in bulk, regularly building up a backlog of millions of messages. The Lambda is normally able to scale to process hundreds of thousands of messages concurrently. This solution has worked well for me with other applications in the past, but I'm now encountering the following intermittent error that reliably begins to appear as the Lambda scales up:
[ERROR] [#############] LAMBDA_RUNTIME Failed to post handler success response. Http response code: 400.
I've been unable to find any information anywhere about what this error means and what causes it. There appears to be no discernible pattern as to which executions encounter it. The function is usually able to run for a brief period without encountering the error and scale to expected levels. But then the error starts to appear quite suddenly and completely destroys the Lambda throughput by forcing it to auto-scale down.
Does anyone know what this "LAMBDA_RUNTIME" error means and what might cause it? My Lambda Function runtime is Node v12.
Your function is being invoked asynchronously, so when it finishes it signals to the caller whether it was successful.
You should have an error some milliseconds earlier, probably an unhandled exception that is not being logged. If that's the case, your function ends without knowing about the exception and tries to post a success response.
I have the same error, except that mine reports a different response code:
[ERROR] [1638918279694] LAMBDA_RUNTIME Failed to post handler success response. Http response code: 413.
I went to the Lambda function in the AWS console and ran the test with a custom event I built, and the error I got there was:
{
"errorMessage": "Response payload size exceeded maximum allowed payload size (6291556 bytes).",
"errorType": "Function.ResponseSizeTooLarge"
}
So this is the actual error, which CloudWatch doesn't show but the test section of the Lambda console does.
I think I'll have to write the result to an S3 file or something similar, but that's another matter.
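The S3 idea mentioned above could look roughly like the sketch below. It is an illustration under assumptions, not a tested production pattern: `build_response` and the 5 MB threshold are hypothetical, and `s3_client` is expected to be something like `boto3.client("s3")`, passed in by the caller.

```python
import json

# Assumed safety threshold, chosen to stay below the ~6 MB response limit.
MAX_INLINE_BYTES = 5 * 1024 * 1024


def build_response(result, s3_client, bucket, key, max_inline_bytes=MAX_INLINE_BYTES):
    """Return small results inline; park large ones in S3 and return a pointer."""
    body = json.dumps(result).encode("utf-8")
    if len(body) <= max_inline_bytes:
        return {"inline": True, "body": body.decode("utf-8")}
    # Too large for a Lambda response: store it and hand back its location.
    s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    return {"inline": False, "bucket": bucket, "key": key}
```

The caller then has to check the `inline` flag and fetch the object from S3 when it is false.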

Getting the response results from an asynchronous call to AWS lambda in python

I have an AWS Lambda function which I can call synchronously and get results back from with the code below:
response = lambda_client.invoke(
    FunctionName=FUNCTION_NAME,
    InvocationType='RequestResponse',
    LogType='Tail',
    Payload=payload,
    Qualifier='$LATEST'
)
The response Payload is of type <botocore.response.StreamingBody object at 0x115fb3160> So I use below code to extract the payload which works fine.
response_body = response['Payload']
response_str = response_body.read().decode('utf-8')
response_dict = eval(response_str)
Now I need to call my Lambda asynchronously, so I changed the invocation type to InvocationType='Event'.
It gives me a response with a payload of the same type as before, a botocore.response.StreamingBody object, but I am getting an error on this line: response_dict = eval(response_str)
The error message says
response_dict = eval(response_str)
File "<string>", line 0
^
SyntaxError: unexpected EOF while parsing
What am I missing? If the response payload is the same type as for the synchronous call, why do I get this parsing error? Any suggestions?
EDIT
For clarity, I understand that if the InvocationType='Event', then we only get the status of the invoke call, not the lambda function result. In my case though, I need both - launch the lambda async and get the result back when done. How do I do that? Is writing the result back to s3 and periodically checking that the only option?
InvocationType='Event' means you aren't getting a response. An asynchronous Lambda invocation means you just want to invoke the function, not wait for the response. The response payload from the function is discarded by the service, so the payload you read is empty, which is why eval fails on it.
When you invoke a function asynchronously, Lambda sends the event to a queue. A separate process reads events from the queue and runs your function. When the event is added to the queue, Lambda returns a success response without additional information. (emphasis added)
https://docs.aws.amazon.com/lambda/latest/dg/invocation-async.html
Note that the queue mentioned here is a queue inside the Lambda service, not to be confused with Amazon Simple Queue Service (SQS).
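To make the difference concrete, here is a minimal sketch of response handling that tolerates both invocation types. It assumes the function returns JSON, parses it with `json.loads` instead of `eval`, and treats the 202 status of an 'Event' invocation as "no result". `parse_invoke_response` and `fake_response` are hypothetical helpers, the latter only imitating the shape of a boto3 invoke response.

```python
import io
import json


def parse_invoke_response(response):
    """Return the decoded function result, or None if there is none."""
    raw = response["Payload"].read().decode("utf-8")
    # 'Event' invocations answer with status 202 and an empty payload.
    if response["StatusCode"] == 202 or not raw:
        return None
    return json.loads(raw)


def fake_response(status_code, payload_bytes):
    """Hypothetical stand-in for the dict returned by lambda_client.invoke."""
    return {"StatusCode": status_code, "Payload": io.BytesIO(payload_bytes)}
```

A `None` result tells the caller that the function output has to be fetched some other way, for example from a location the function writes to.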