Batch size: 100
Last processing result: PROBLEM: Lambda internal error. Please contact Lambda customer support.
Starting position: TRIM_HORIZON
We have a Lambda function that consumes messages from MSK (Kafka). It works without problems,
but sometimes it stops working for no apparent reason (no logs) and only shows:
PROBLEM: Lambda internal error. Please contact Lambda customer support.
We log almost every line of the Lambda function's code; there are no errors.
We don't have a paid support plan, but I think that if this is an internal AWS error, they should provide free support. Anyway, can we solve it?
I have an API that I host using Lambda (Node.js) with API Gateway. I'm using Serverless to deploy.
Generally things have been fine, but while I was working on a specific function today, I started to receive HTTP 500 errors when hitting the endpoint. However, while there were still API Gateway access logs for the endpoint, there were no CloudWatch logs for the Lambda functions getting hit. I was able to verify that the authorizer was getting hit successfully and not returning any issue (if it was, it would have been a 401). After using CLI tools to invoke the function from the command line, the 500 error went away and I was able to successfully hit the endpoints again.
Has anyone ever run into this before? If I'm missing a debugging step, I would really like to know. It was really concerning that my API could be generating 500 errors with no paper trail to help me understand what was happening.
You can check your role and permissions; this link could help you: https://aws.amazon.com/premiumsupport/knowledge-center/api-gateway-lambda-stage-variable-500/
You can also debug further with X-Ray: https://docs.aws.amazon.com/lambda/latest/dg/services-xray.html
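If you prefer to turn on X-Ray tracing from code rather than the console, here is a minimal boto3 sketch (the function name is a placeholder):

import boto3

# Enable X-Ray active tracing on an existing function
# ("my-function" is a placeholder name)
client = boto3.client("lambda")
client.update_function_configuration(
    FunctionName="my-function",
    TracingConfig={"Mode": "Active"},
)

After that, traces for each invocation show up in the X-Ray console, which helps when the function itself produces no logs.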
I have a REST API in AWS API Gateway that invokes a Python Lambda function and returns some result.
Most of the time this workflow works fine, meaning that the Lambda function is executed and passes the result back to the API, which in turn returns a 200 OK response.
However, there are a few times when I get a 500 error code from the API and the Lambda seems not to be executed at all. The response.reason says: "Internal Server Error" and no additional information is given.
There is no difference between the failing requests and the successful ones in terms of the method or parameter format.
One more comment is that the API has the cache setting enabled.
I've seen similar posts; some of the answers mention the format of the JSON object returned by the Lambda function, and others point to IAM permission issues, but none of those seem to be the cause here. In fact, as this post's title says, this is intermittent behavior: most of the time it works fine, but occasionally I get this error.
Any hint would be highly appreciated.
I had the same problem, and in my case I had to enable "Log full requests/responses data" together with INFO-level logs on the API Gateway stage to see the following:
(xxx) Endpoint response body before transformations:
{
"Type": "Service",
"message": "INFO: Lambda is initializing your function. It will be ready to invoke shortly."
}
In my case the issue was related to the fact that the Lambda was in the Inactive state, which happens if a function remains idle for several weeks.
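If you suspect this, one way to confirm is to check the function's State field, for example with this minimal boto3 sketch (the function name is a placeholder):

import boto3

# An Inactive function is re-initialized on the next invoke,
# which is what the "initializing your function" message refers to
client = boto3.client("lambda")
config = client.get_function_configuration(FunctionName="my-function")
print(config["State"], config.get("StateReason", ""))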
I had the same problem, and I suspected a timeout, possibly due to the Lambda reaching its memory limit.
I raised the memory limit (128 MB -> 512 MB) and increased the timeout to 10 s (the default is 3 s), and now I'm able to see the timeout in action.
I still have the problem for the moment, but now I'll be able to investigate.
I hope this helps you.
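For reference, both settings can also be changed without the console; a minimal boto3 sketch (function name and values are placeholders):

import boto3

# Raise the memory limit and the timeout on an existing function
client = boto3.client("lambda")
client.update_function_configuration(
    FunctionName="my-function",
    MemorySize=512,  # MB
    Timeout=10,      # seconds
)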
I see this with an HTTP API integration. It's intermittent, and it appears to improve when adding provisioned concurrency to the Lambda. For example, on a Lambda that has between 4 and 10 concurrent instances, but usually hovers in the 4 to 8 range, configuring 5 to 6 provisioned concurrent instances helped reduce, and possibly eliminate, these 500 errors.
I am still monitoring to see whether they are gone for good. The frequency of these errors has gone down drastically with the provisioned instances.
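If you want to script the provisioned concurrency change instead of using the console, here's a minimal boto3 sketch (function name, alias, and count are placeholders):

import boto3

# Provisioned concurrency applies to a published version or alias,
# not to $LATEST
client = boto3.client("lambda")
client.put_provisioned_concurrency_config(
    FunctionName="my-function",
    Qualifier="prod",  # alias or version
    ProvisionedConcurrentExecutions=5,
)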
I enabled streams on my DynamoDB table. As items are modified, a Lambda function is triggered. I think I set everything up correctly on both the Lambda trigger side (including permissions) and the DynamoDB side. I also ran my Lambda function with test data and it succeeded. However, when items are modified in the table, the trigger did not start my Lambda function. Instead, I got the following error:
Batch size: 100 Last processing result: PROBLEM: Function call failed
Any ideas on the best way to debug this? I went to CloudWatch Logs, but there were no logs associated with the trigger/stream.
Thanks.
Edit: These are logs for the Lambda function itself (not its DynamoDB trigger). The trigger didn't generate any log statements.
START RequestId: 3a08eedc-f0de-11e8-9008-033b48d2cb67 Version: $LATEST
END RequestId: 3a08eedc-f0de-11e8-9008-033b48d2cb67
REPORT RequestId: 3a08eedc-f0de-11e8-9008-033b48d2cb67 Duration: 81.85 ms Billed Duration: 100 ms Memory Size: 128 MB Max Memory Used: 30 MB
I ran into this issue today.
I debugged it by manually triggering the Lambda with the Test button at the top of the main Lambda page. That showed the error output from trying to run my Lambda.
The reason I had an error was the handler parameter: I had a non-standard JavaScript function name and had forgotten to configure that in my Lambda.
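The same mismatch can happen in any runtime. As a Python illustration (the file and function names here are made up): with the code below in process_event.py, the function's handler setting must be process_event.my_handler; anything else fails before your code runs, so no logs are produced.

# process_event.py -- handler setting must read "process_event.my_handler"
def my_handler(event, context):
    return {"statusCode": 200}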
In my case, my Lambda's role did not have permission to publish to SNS, and the Lambda code was writing to an SNS topic. So I added a policy to the Lambda role giving it permission to publish to any SNS topic.
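A minimal sketch of that fix with boto3, in case you want it in code (role and policy names are placeholders; scope Resource down to specific topics where you can):

import json
import boto3

# Attach an inline policy allowing SNS publishes to the function's role
iam = boto3.client("iam")
iam.put_role_policy(
    RoleName="my-lambda-role",
    PolicyName="allow-sns-publish",
    PolicyDocument=json.dumps({
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "sns:Publish", "Resource": "*"}
        ],
    }),
)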
In my case, the problem came from the stream batch size of 100. In the Lambda code I was checking the event and exiting early if it didn't meet the requirements.
In my case, the handler method was not configured correctly for a Lambda function written in Java. The format I used for setting the handler is as follows:
packageName.className::handlerMethod
For example, for my handler class
package com.example;

import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.DynamodbEvent;

public class App implements RequestHandler<DynamodbEvent, String> {
    public String handleRequest(DynamodbEvent ddbEvent, Context context) {
        return "OK"; // process ddbEvent.getRecords() here
    }
}
the handler should be defined as
com.example.App::handleRequest
This sounds like a possible use case for Rookout, if you need to follow variable values in your live Lambda when you can't generate logs and running it locally won't give you real-world event trigger data.
We have two React Native apps that use AWS Cognito for authentication, via the react-native-aws-cognito-js library. The apps had been working fine until the last two days, when they started experiencing intermittent "Internal Server Error" responses.
How can I find more information about this error? Is there any tool that can help us pinpoint the cause?
Update
From CloudTrail, each API call has a "CreateNetworkInterface" event. Many of these calls have the error code "Client.NetworkInterfaceLimitExceeded". What is the cause, and what is the solution?
According to this AWS doc (in Chinese), CloudWatch will not be written to when the error is due to insufficient IPs/ENIs. That explains the increase in error count with no logs in CloudWatch.
Update 2
We have found a scheduled Lambda job which may have exhausted the available IP addresses. We stopped the batch job, but we still can't have many users log in due to the "Client.NetworkInterfaceLimitExceeded" error. I noticed there are many "CreateNetworkInterface" events but few "DeleteNetworkInterface" events. How can I clean up / reset all network interfaces in the VPC?
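In case it helps anyone hitting the same wall: interfaces that are detached show up with status "available", and those can be swept with a script like this boto3 sketch (verify what each ENI belongs to before deleting anything in a real VPC):

import boto3

# Delete only detached ("available") ENIs; in-use ones are skipped
ec2 = boto3.client("ec2")
resp = ec2.describe_network_interfaces(
    Filters=[{"Name": "status", "Values": ["available"]}]
)
for eni in resp["NetworkInterfaces"]:
    print("deleting", eni["NetworkInterfaceId"])
    ec2.delete_network_interface(NetworkInterfaceId=eni["NetworkInterfaceId"])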
Short answer: CloudTrail.
Long answer, with a suggestion:
Assuming your application code is fine, the most likely cause of your 500 error is Cognito's default limits (e.g., the number of calls per user): https://docs.aws.amazon.com/cognito/latest/developerguide/limits.html.
AWS suggests using CloudTrail for logging API calls.
However, to prove the limits are the issue, I would first add some logging around the API call yourself; in development you can then hit your app/API with a high number of calls, and most likely you will see the 500 error once you exceed the limits.
You could do the following in the terminal:
for i in `seq 1 1000`; do curl --cookie SecureCookie=TokenValueFromAWS http://localhost:desirablePort/SecuredPath; done
I am using an AWS Lambda function to convert uploaded WAV files in a bucket to MP3 format and then move the result to another bucket. It works correctly, but there's a problem with triggering: when I upload small WAV files, the Lambda function is called once, but when I upload a large WAV file, the function is triggered multiple times.
I have googled this issue and found that Lambda is stateless, so it may be called multiple times (I'm not sure whether these triggers are for multiple uploads or the same upload).
https://aws.amazon.com/lambda/faqs/
Is there any way to have this function invoked only once per upload?
Short version:
Try increasing the timeout setting in your Lambda function configuration.
Long version:
I suspect you are running into the Lambda function timing out here.
S3 events are asynchronous in nature, and a Lambda function listening to S3 events is retried at least 3 times before the event is rejected. You mentioned your Lambda function executes only once (with no error) for smaller uploads, upon which you do the conversion and re-upload. It's possible that the time required for conversion and re-upload in your code is greater than the timeout setting of your Lambda function.
Therefore, you might want to try increasing the timeout setting in your Lambda function configuration.
By the way, one way to confirm that your Lambda function is invoked multiple times is to look in the CloudWatch logs for occurrences of the request id (e.g. 67fe6073-e19c-11e5-1111-6bqw43hkbea3):
START RequestId: 67jh48x4-abcd-11e5-1111-6bqw43hkbea3 Version: $LATEST
This id represents a specific event for which the Lambda was invoked, and it should be the same for all Lambda executions that are responsible for the same S3 event.
Also, you can look for the execution time (Duration) in the following log line, which marks the end of one Lambda execution:
REPORT RequestId: 67jh48x4-abcd-11e5-1111-6bqw43hkbea3 Duration: 244.10 ms Billed Duration: 300 ms Memory Size: 128 MB Max Memory Used: 20 MB
If not a solution, it will at least give you some room to debug in the right direction. Let me know how it goes.
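If you'd rather search programmatically than scroll the console, here's a minimal boto3 sketch (the log group name is a placeholder; the request id is the one from the log lines above):

import boto3

# Count START lines for one request id; more than one means the
# same event was retried
logs = boto3.client("logs")
resp = logs.filter_log_events(
    logGroupName="/aws/lambda/my-function",
    filterPattern='"67jh48x4-abcd-11e5-1111-6bqw43hkbea3"',
)
starts = [e for e in resp["events"] if e["message"].startswith("START")]
print(len(starts), "invocation(s) for this request id")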
An event executing a Lambda several times is due to Lambda's retry behavior, as described in the AWS documentation:
"Your code might raise an exception, time out, or run out of memory. The runtime executing your code might encounter an error and stop. You might run out of concurrency and be throttled."
There could also be an error in the Lambda that makes the client or service invoking it retry.
Use CloudWatch logs to find the error; resolving it could resolve the problem.
I too faced the same problem; in my case it was because of an application error, and resolving it helped me.
Recently, AWS Lambda added a setting to change the default retry behavior: set "Retry attempts" to 0 (the default is 2) under the "Asynchronous invocation" settings.
For some in-depth understanding of this issue, you should look into message delivery guarantees. Then you can implement a solution using the idempotent consumer pattern.
The context object contains the ID of the request you are currently handling. This ID won't change even if the same event fires multiple times. You could save this ID each time an event triggers, and check that it hasn't already been processed before handling the message.
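A rough sketch of that pattern in Python, assuming a DynamoDB table named processed-requests with a request_id string key (both names are placeholders); the conditional put fails on a duplicate, so a retried event becomes a no-op:

import boto3
from botocore.exceptions import ClientError

table = boto3.resource("dynamodb").Table("processed-requests")

def handler(event, context):
    try:
        # Record the request id atomically; fails if it was already seen
        table.put_item(
            Item={"request_id": context.aws_request_id},
            ConditionExpression="attribute_not_exists(request_id)",
        )
    except ClientError as e:
        if e.response["Error"]["Code"] == "ConditionalCheckFailedException":
            return  # duplicate delivery, already handled
        raise
    # ... process the event exactly once here ...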
In the Lambda configuration, look for "Asynchronous invocation"; its "Retry attempts" option is the maximum number of times to retry when the function returns an error.
Here you can also configure a dead-letter queue.
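The same setting can be applied from code as well; a minimal boto3 sketch (the function name is a placeholder):

import boto3

# Disable retries for asynchronous invocations of this function
client = boto3.client("lambda")
client.put_function_event_invoke_config(
    FunctionName="my-function",
    MaximumRetryAttempts=0,
)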
Multiple retries can also happen due to a read timeout. I fixed it with --cli-read-timeout 0.
E.g., if you are invoking the Lambda with the AWS CLI or a Jenkins execute-shell step:
aws lambda invoke --cli-read-timeout 0 --invocation-type RequestResponse --function-name ${functionName} --region ${region} --log-type Tail --payload '{}' out
I was also facing this issue earlier; try setting the retry count to 0 under "Asynchronous invocation".