AWS Lambda throttled, but no evidence in the metrics - amazon-web-services

When running Lambda at high concurrency, I get the following error in the CloudWatch logs (which I haven't seen anywhere else on the web!):
Execution failed due to configuration error: Lambda was throttled while using the Lambda Execution Role to set up for the Lambda function
When I check the "Throttled invocations" metric, it doesn't show these throttles.
Why doesn't the metric show these throttles? Has anyone seen this throttle error before? It is not the usual throttle error.

For me, API Gateway throttling caused this issue, even though I wasn't anywhere near the throttling limits. After removing the limit on API Gateway (going back to the default settings), I haven't faced this issue again.

Related

Resolve Performance Issues with NodeJS AWS Lambda API

I am new to AWS and having some difficulty tracking down and resolving some latency we are seeing on our API. Looking for some help diagnosing and resolving the issue.
Here is what we are seeing:
If an endpoint hasn't been hit recently, then on the first request we see a 1.5-2s delay marked as "Initialization" in the CloudWatch Trace.
I do not believe this is a cold start, because each endpoint is configured to have 1 provisioned concurrency, so we should not get a cold start unless there are 2 simultaneous requests. Also, the billed duration includes this initialization period.
A cold start happens when the first request hits AWS Lambda and a container has to be prepared to run your code; this takes some time and delays the request.
When a second request arrives and the container is already up and running, it is processed quickly.
https://aws.amazon.com/blogs/compute/operating-lambda-performance-optimization-part-1/
This is the default behavior of cold start, but since you said that you were using provisioned concurrency, that shouldn't happen.
Provisioned concurrency takes some time to become active after it is configured. You can verify whether this Lambda was initialized on demand or with provisioned concurrency as follows.
AWS Lambda sets an environment variable called AWS_LAMBDA_INITIALIZATION_TYPE, which contains the value on-demand or provisioned-concurrency.
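A minimal sketch of that check for a Python function (the handler name, the "unknown" fallback, and the returned key are assumptions for illustration; only the environment variable name comes from the answer above):

```python
import os


def initialization_type() -> str:
    """Return how this execution environment was initialized.

    Lambda sets AWS_LAMBDA_INITIALIZATION_TYPE itself. The "unknown"
    fallback is an assumption of this sketch, e.g. for local runs.
    """
    return os.environ.get("AWS_LAMBDA_INITIALIZATION_TYPE", "unknown")


def handler(event, context):
    init_type = initialization_type()
    # Log the value so it shows up in CloudWatch Logs for each invocation.
    print(f"initialization type: {init_type}")
    return {"initializationType": init_type}
```

Comparing the logged value for a slow request against a fast one should show whether the slow request landed on an on-demand environment.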

Handling failure scenarios of AWS Lambda

If my Lambda function fails, is there any way in AWS to invoke the same function again after 3-4 hours?
If yes, what would be the flow to do so?
It depends on the failure. Clients such as the AWS CLI and the AWS SDK retry on client timeouts, throttling errors (429), and other errors that aren't caused by a bad request. The AWS documentation has more on automatic retries.
If you want custom retry logic, a dead-letter queue (DLQ) can be an option. See more details here: https://aws.amazon.com/pt/blogs/compute/robust-serverless-application-design-with-aws-lambda-dlq/
Or you can use a CloudWatch event to trigger on Lambda failure.
Here is a good article that explains that approach:
https://aws.amazon.com/blogs/mt/get-notified-specific-lambda-function-error-patterns-using-cloudwatch/
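As a rough illustration of custom retry logic on top of a DLQ, the sketch below decides which failed events have waited long enough to be retried. It assumes the DLQ is an SQS queue and that records arrive in the shape Lambda uses for SQS events, where attributes.SentTimestamp is epoch milliseconds as a string. The function name and the 3-hour constant are hypothetical. How the due events get re-invoked, and what happens to the ones that are not yet due, is left out.

```python
import time

RETRY_DELAY_MS = 3 * 60 * 60 * 1000  # 3 hours, the lower bound from the question


def split_due(records, now_ms=None, delay_ms=RETRY_DELAY_MS):
    """Split SQS records into (due, not_due) by time spent in the queue."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    due, not_due = [], []
    for record in records:
        # SentTimestamp is epoch milliseconds, delivered as a string.
        sent_ms = int(record["attributes"]["SentTimestamp"])
        if now_ms - sent_ms >= delay_ms:
            due.append(record)
        else:
            not_due.append(record)
    return due, not_due
```

A consumer that runs on a schedule could call split_due on each batch and re-invoke the original function only for the due records.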

Lambda returning Http 200 on timeout to API Gateway

I have a Lambda function whose timeout is set to 10 seconds. The Lambda is triggered from API Gateway. In the CloudWatch logs I can see a timeout error, Task timed out after 10.00 seconds, which is fine. But the response code in my API Gateway logs is HTTP 200.
I read a few AWS docs and Stack Overflow answers to find out whether this is expected or whether there is an issue with my code, but none of them gives a clear answer, and many of the questions are too old to follow.
I also did not find anything substantial in the AWS docs.
As per AWS,
For Lambda custom integrations, you must map errors returned by Lambda
in the integration response to standard HTTP error responses for your
clients. Otherwise, Lambda errors are returned as 200 OK responses by
default and the result is not intuitive for your API users.
Error Handling here
You have to explicitly handle such errors.
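To make the mapping idea concrete, here is a small local simulation of how a regex on the Lambda error message can select an HTTP status. It is not API Gateway configuration. The patterns, status codes, and function name are hypothetical, and treating the pattern as a match against the whole error message is an assumption of this sketch. Whether a given failure reaches the pattern at all depends on how Lambda reports it.

```python
import re

# Hypothetical (pattern, status) pairs, checked in order.
# These are examples for this sketch, not values from the question.
SELECTION_PATTERNS = [
    (r".*Task timed out.*", 504),
    (r"\[BadRequest\].*", 400),
    (r"\[NotFound\].*", 404),
]


def select_status(error_message, default=200):
    """Return the status of the first pattern matching the whole message."""
    for pattern, status in SELECTION_PATTERNS:
        if re.fullmatch(pattern, error_message):
            return status
    # No pattern matched: fall back to the default response, which is how
    # an unmapped Lambda error ends up as 200 OK.
    return default
```

With this table, a message such as "[BadRequest] missing field" selects 400, while any unmapped message falls through to the default.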
I recently stumbled across this as well.
In my scenario, I was wondering in general what API Gateway returns when the Lambda execution times out, runs out of memory, etc., and found an answer in the AWS forum:
AWS Forum
The Lambda error regex is only applied when an execution result failed like the exception was thrown or marked as failed inside your Lambda function. If the exception is failed by Lambda service like access denied, throttled, etc, the regex will not be applied.
One possible solution is to use API Gateway with Lambda proxy integration. It then maps all the service errors to an HTTP 502 response.
The forum thread I linked also mentions using the Event invocation type. Maybe that helps.
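With proxy integration the function itself chooses the status code by returning an object with statusCode and body. A minimal sketch, where do_work and its validation rule are hypothetical stand-ins for real business logic:

```python
import json


def do_work(event):
    """Hypothetical business logic; replace with the real implementation."""
    params = event.get("queryStringParameters") or {}
    if "name" not in params:
        raise ValueError("missing query parameter: name")
    return {"greeting": "hello " + params["name"]}


def handler(event, context):
    try:
        return {"statusCode": 200, "body": json.dumps(do_work(event))}
    except ValueError as exc:
        # Errors the code anticipates become explicit client errors.
        return {"statusCode": 400, "body": json.dumps({"error": str(exc)})}
    except Exception:
        # Anything unexpected is still returned as a well-formed response.
        return {"statusCode": 500, "body": json.dumps({"error": "internal error"})}
```

This only covers errors the code can catch. A timeout kills the function before it can return anything, so that case is still handled by API Gateway rather than by the handler.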

AWS Lambda: Monitoring lambda timeout that was triggered by SNS.

I have an AWS Lambda function that is triggered by an SNS message. Many times it has reached the maximum duration allowed by AWS, and AWS killed it immediately.
I have to dig into either the Lambda logs or the Lambda duration chart to find out about the error.
Is there a better way to report this kind of error?
Yes, there are third-party tools that help you monitor your environment and provide exactly that: filtering on specific errors and drilling down into what happened (the input event, the outgoing HTTP requests, etc.).
Moreover, you can configure alerts on specific errors and receive them via Slack or email.
Disclosure: I work for Lumigo, a company that does exactly that.
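Without a third-party tool, one option is to have the function notice that it is about to run out of time and log a recognizable marker before Lambda kills it. The sketch below assumes a Python function and uses the context object's get_remaining_time_in_millis() method. The safety margin, the NEAR_TIMEOUT marker, the exception class, and the placeholder work function are all hypothetical.

```python
SAFETY_MARGIN_MS = 1000  # assumed margin; tune to the slowest single step


class NearTimeout(Exception):
    """Raised by this sketch when too little execution time is left."""


def process_items(items, context, work):
    """Run work(item) for each item, stopping early if time is running out."""
    done = []
    for item in items:
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            # A fixed marker is easy to find or alert on in the logs.
            print(f"NEAR_TIMEOUT processed={len(done)} total={len(items)}")
            raise NearTimeout(f"stopped after {len(done)} of {len(items)} items")
        done.append(work(item))
    return done


def handler(event, context):
    # SNS delivers each message under Records[i].Sns.Message.
    messages = [record["Sns"]["Message"] for record in event["Records"]]
    return process_items(messages, context, work=len)  # len is a placeholder
```

This only helps when the work can be checked between steps. A single call that blocks past the timeout is still killed without the marker being logged.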

AWS Lambda using API Gateway error message

Everything was working yesterday, and I'm still only testing, so my load shouldn't be high to begin with, but I keep receiving these errors today:
{
Message = "We currently do not have sufficient capacity in the region you requested. Our system will be working on provisioning additional capacity. You can avoid getting this error by temporarily reducing your request rate.";
Type = Service;
}
What is this error message, and should I be concerned that something like this would happen when I go into production? This is a serious error because my users are required to log in using calls to API Gateway (which uses AWS Lambda).
This kind of error should not last long, as it immediately triggers a provisioning request on the AWS side.
If you are concerned about your API Gateway availability, consider creating a redundant Lambda function in another region and switching to it whenever this error occurs. However, calling Lambda in a remote region can introduce higher latency.
Another suggestion: review the AWS limits for the API Gateway and Lambda services in your account. If your requests exceed the limits, raise a ticket with AWS to increase them.
Amazon API Gateway Limits
Resource                        Default Limit
Maximum APIs per AWS account    60
Maximum resources per API       300
Maximum labels per API          10
Requesting a limit increase is free of charge.
Refer: Amazon API Gateway Limits
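Since the error message itself suggests temporarily reducing the request rate, a client can retry with exponential backoff and jitter instead of retrying immediately. This is a generic sketch, not an AWS API. The function names, default values, and the is_retryable callback are assumptions for illustration.

```python
import random
import time


def backoff_delays(attempts, base=0.5, cap=30.0, rng=random.random):
    """Yield one randomized delay in seconds per attempt, growing up to cap."""
    for attempt in range(attempts):
        yield rng() * min(cap, base * 2 ** attempt)


def call_with_backoff(call, is_retryable, attempts=5, sleep=time.sleep):
    """Call call() up to `attempts` times, sleeping between retryable failures."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for delay in backoff_delays(attempts):
        try:
            return call()
        except Exception as exc:
            if not is_retryable(exc):
                raise
            last_error = exc
            sleep(delay)
    # Every attempt failed with a retryable error: surface the last one.
    raise last_error
```

Here is_retryable would return True only for errors like the capacity error above, so that genuine bad requests still fail right away.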
AWS Lambda posted an event on the service health dashboard, so please follow this for further details on that specific issue.
Unfortunately, if you want to return a custom code when Lambda errors in this way you would have to write a mapping template and attach it to every integration response where you used a Lambda integration.
We recognize that this is suboptimal and is work most customers would prefer API Gateway just handle for them. With that in mind, we already have a high priority item on our backlog to make it easier to pass through the status codes from the Lambda integration. I cannot, however, commit to a timeframe as to when this would be available.