Service Unavailable Error from API Gateway

I am using multiple Lambdas in a Step Functions workflow. The client hits API Gateway, which invokes a Lambda, and that Lambda invokes the Step Functions' Lambdas to get the response.
The response time is 3-7 seconds when lambdas are already warmed up.
But on a cold start, some requests take more than 30 seconds, depending on the input.
As API Gateway has a timeout of 30 seconds, I get a "503 Service Unavailable" error for some requests even though the execution is still running, although AWS says it should return 504 for a request timeout.
Then I enabled provisioned concurrency to address the latency, expecting it to solve the timeout issue. But I still sometimes get a Service Unavailable error.
I checked the CloudWatch trace. Interestingly, each Lambda still shows an initialization time of 3 seconds even after provisioning. Now I doubt whether provisioned concurrency is working or whether I am misunderstanding it.
Any advice?
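For context on where that 3-second initialization can come from: provisioned concurrency only pre-runs the code outside the handler, and requests that spill over the provisioned count (or hit a version/alias that has no provisioning configured) still cold start. A rough sketch of the distinction, with illustrative names:

    import { LambdaClient, InvokeCommand } from "@aws-sdk/client-lambda";

    // Module scope: runs once per execution environment. With provisioned concurrency this
    // happens ahead of traffic, so it should not add latency on provisioned instances.
    const lambda = new LambdaClient({});

    export const handler = async (event: any) => {
      // Handler scope: runs on every request, warm or cold; keep per-request work small.
      const response = await lambda.send(new InvokeCommand({
        FunctionName: process.env.WORKER_FUNCTION!, // illustrative downstream Lambda name
        Payload: Buffer.from(JSON.stringify(event)),
      }));
      return {
        statusCode: 200,
        body: Buffer.from(response.Payload ?? new Uint8Array()).toString(),
      };
    };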

Related

Resolve Performance Issues with NodeJS AWS Lambda API

I am new to AWS and having some difficulty tracking down and resolving some latency we are seeing on our API. Looking for some help diagnosing and resolving the issue.
Here is what we are seeing:
If an endpoint hasn't been hit recently, then on the first request we see a 1.5-2s delay marked as "Initialization" in the CloudWatch Trace.
I do not believe this is a cold start, because each endpoint is configured to have 1 provisioned concurrency, so we should not get a cold start unless there are 2 simultaneous requests. Also, the billed duration includes this initialization period.
A cold start means that when your first request hits AWS Lambda, a container has to be prepared to run your code; this takes some time and delays your request.
When a second request hits the Lambda and the container is already up and running, it is processed quickly.
https://aws.amazon.com/blogs/compute/operating-lambda-performance-optimization-part-1/
This is the default behavior of cold start, but since you said that you were using provisioned concurrency, that shouldn't happen.
Provisioned concurrency takes a short while to become active after it is configured. You can follow these steps to verify whether an invocation ran on an on-demand or a provisioned environment.
AWS Lambda sets an environment variable called AWS_LAMBDA_INITIALIZATION_TYPE that contains either on-demand or provisioned-concurrency.
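For example, a minimal handler sketch (illustrative, Node.js/TypeScript) that logs this variable, so the CloudWatch logs show which kind of environment served each request:

    // Logs whether this execution environment was created on demand or provisioned.
    // AWS_LAMBDA_INITIALIZATION_TYPE is set by the Lambda runtime itself.
    export const handler = async (event: any) => {
      console.log("initialization type:", process.env.AWS_LAMBDA_INITIALIZATION_TYPE);
      return { statusCode: 200, body: JSON.stringify({ ok: true }) };
    };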

API Gateway V2 slowness on initial request

I've got an API Gateway V2 (Protocol:HTTP) style endpoint, it simply makes a request to my Lambda function and gives me the response. I've noticed that if I make no request for about 10 minutes or so, that on a new request it's going WAY slower than the requests afterwards. It's the same function that does the same thing every time so I'm not sure why this is happening, has anyone else ever had this and found a solution?
The reason for this is that your Lambda function has to be started before it can handle requests.
This is also called a cold start.
Starting a new instance of your Lambda does take some time. Once it is started, it will serve several requests. At some point the AWS Lambda service is going to shut down your Lambda function, for example when there has not been any traffic for a while.
That's where your observation is coming from:
I've noticed that if I make no request for about 10 minutes or so, that on a new request it's going WAY slower than the requests afterwards.
When there are no instances of your Lambda running and a new request comes in, the AWS Lambda service needs to instantiate a "fresh" instance of your Lambda.
You could read this blog post, which touches on this topic:
https://aws.amazon.com/blogs/compute/operating-lambda-performance-optimization-part-1/
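If you want to observe this from the function itself, one rough way (a sketch, not an official API) is a module-level flag that is false only on the very first invocation of each execution environment:

    // false only during the first invocation of a freshly started execution environment
    let warm = false;

    export const handler = async (event: any) => {
      console.log(warm ? "warm invocation" : "cold start: new execution environment");
      warm = true;
      return { statusCode: 200, body: "ok" };
    };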

API Gateway with Lambda returns 404 on first call when left idle?

I've built an API with API gateway and Lambda. I've noticed that when left idle (usually a few hours), it will fail on the first call. Has anyone else encountered this issue?
Should I implement retries on my API calls or is there some configuration for Lambda that I am missing out on?
[INFO] 2019-04-15T03:18:58.263Z SUCCESS: Connection to RDS MySQL instance succeeded
This is the only line that was logged in CloudWatch for my Lambda function.
I have found out that AWS Lambda will take longer than usual to invoke a function if left idle due to cold starts.
The error I received was due to the Lambda taking longer to return a response than the timeout I had defined for my HTTP requests.
I removed the VPC from my Lambda function, as suggested, to lower the cold start time, and I have not experienced any cold start issues with Lambda since.
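On the question of retries: if the Lambda has to stay in a VPC, or cold starts remain possible for other reasons, a common complement is a longer client-side timeout plus a simple retry. A rough sketch, assuming a Node.js caller with the built-in fetch (the timeout and attempt counts are illustrative):

    // Retry an API call a few times with a generous per-attempt timeout,
    // so an occasional cold start does not surface as a hard failure.
    async function callApi(url: string, attempts = 3, timeoutMs = 15000): Promise<string> {
      for (let attempt = 1; attempt <= attempts; attempt++) {
        try {
          const response = await fetch(url, { signal: AbortSignal.timeout(timeoutMs) });
          if (response.ok) return await response.text();
        } catch (err) {
          if (attempt === attempts) throw err;
        }
      }
      throw new Error(`all ${attempts} attempts failed`);
    }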

Warming Lambda Function with Cloudwatch schedule rules

I'm trying to warm a Lambda function (inside a VPC, accessing a private RDS) with CloudWatch. The rate is 5 minutes (only for experimenting); I intend to make it 35 minutes later on.
After I see in the CloudWatch logs that the function has been called (and completed; I have set up a condition that returns an API Gateway response immediately if no input is given, as sketched below), I call the function from the API Gateway URL.
However, I'm still getting what looks like a cold start: the first request returns a response in 2 seconds. If I do it again, I get the response in 200 ms.
So my question is:
What did I do wrong? Can I actually warm a Lambda function with a CloudWatch schedule?
Does dropping the request immediately affect this behaviour? The DB connection is not established if the request comes from CloudWatch.
Thanks!
****EDIT****
I tried connecting to the DB before returning early when the function is called by CloudWatch, but it doesn't change anything. The first request through the API still takes around 2 s and the next ones are around 200 ms.
****EDIT 2****
I tried removing the schedule entirely, and the cold start takes 9 s. So I guess the 2 s already excludes the cold start. Is it possible that the problem lies in other services, such as API Gateway?
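For reference, a rough sketch of the ping-detection pattern described above, assuming the schedule rule sends a constant input such as {"source": "warmup"} and a mysql2 client (both illustrative; the question does not specify either):

    import mysql, { Connection } from "mysql2/promise";

    let connection: Connection | undefined; // module scope, so warm invocations reuse it

    async function getConnection(): Promise<Connection> {
      if (!connection) {
        connection = await mysql.createConnection({ host: process.env.DB_HOST }); // illustrative config
      }
      return connection;
    }

    export const handler = async (event: any) => {
      if (event?.source === "warmup") {
        // Warming ping from the CloudWatch rule: do the expensive setup, then return early.
        await getConnection();
        return { statusCode: 200, body: "warmed" };
      }

      // Real request via API Gateway: the connection should already exist on a warm instance.
      const db = await getConnection();
      const [rows] = await db.query("SELECT 1");
      return { statusCode: 200, body: JSON.stringify(rows) };
    };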

Could Lambda return throttling errors even if it doesn't reach the maximum concurrency number (1000)?

Here is the situation I am dealing with.
Lambda has a 1000 concurrency limit (there is no reserved concurrency).
100-200 clients access the Lambda at a time.
Lambda should not be throttled at those figures (100-200).
However, Lambda returns a lot of 502 errors.
And my assumption is:
At first, no Lambda instances are up.
When Lambda receives a lot of requests, it starts scaling.
However, because of Lambda's cold start time, it takes a while to spin up enough concurrent instances to handle all the requests, and as a result it returns errors (even though the maximum concurrency limit of 1000 is never reached).
Is my assumption correct? If so, is this situation inevitable?
I have read about warming a Lambda by sending it ping requests at a regular interval.
However, that does not look like it would solve this issue, because the ping keeps only a single Lambda instance warm, so the same problem occurs when a lot of requests arrive at once.
------Edit------
Regarding M Mo's questions:
How are the Lambdas being invoked?
They are invoked via API Gateway.
If through API Gateway, are you using proxy integration?
Yes, proxy integration.
Do the Lambdas call any other resources?
Yes, the Lambda calls S3 to get objects.
What are the average response times of the Lambdas?
About 1 second.
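One thing that may be worth checking alongside the scaling theory: with proxy integration, API Gateway also returns 502 when the function throws an unhandled error or returns a response that is not in the expected proxy shape, and creating SDK clients at module scope keeps warm invocations fast. A rough sketch, with illustrative bucket and key names:

    import { S3Client, GetObjectCommand } from "@aws-sdk/client-s3";

    const s3 = new S3Client({}); // module scope: reused across warm invocations

    export const handler = async (event: any) => {
      try {
        const result = await s3.send(new GetObjectCommand({
          Bucket: process.env.BUCKET_NAME!, // illustrative configuration
          Key: "example-object.json",       // illustrative key
        }));
        const body = (await result.Body?.transformToString()) ?? "";
        return { statusCode: 200, body }; // the shape proxy integration expects
      } catch (err) {
        console.error(err);
        // Returning a structured error avoids an unhandled exception surfacing as a 502.
        return { statusCode: 500, body: JSON.stringify({ message: "failed to read object" }) };
      }
    };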