Amazon API gateway timeout - amazon-web-services

I have an issue with API Gateway. I made a few API methods; sometimes they run longer than 10 seconds, and Amazon returns a 504 error.
Please help! How can I increase the timeout?
Thanks!

Right now the default limit for Lambda invocation or HTTP integration is 30s according to http://docs.aws.amazon.com/apigateway/latest/developerguide/limits.html and this limit is not configurable.

As of Dec 2017, the maximum value is still 29 seconds, but you should be able to customize the timeout value.
https://aws.amazon.com/about-aws/whats-new/2017/11/customize-integration-timeouts-in-amazon-api-gateway/
This can be set in "Integration Request" of each method in APIGateway.

Finally, in 2022, we have a workaround. Unfortunately AWS did not change API Gateway, so its limit is still 29 seconds, but you can use the built-in HTTPS endpoint of the Lambda itself (see "Built-in HTTPS Endpoints for Single-Function Microservices"),
which is confirmed to have no timeout of its own, so you essentially get the full 15-minute window of the Lambda timeout: https://twitter.com/alex_casalboni/status/1511973229740666883
For example, this is how you define a function with the HTTP endpoint using aws-cdk and TypeScript:
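import * as path from 'path';
import * as cdk from 'aws-cdk-lib'; // assumes CDK v2
import * as lambda from 'aws-cdk-lib/aws-lambda';

// Runs inside a Stack constructor, so `this` is the enclosing construct.
const backendApi = new lambda.Function(this, 'backend-api', {
  memorySize: 512,
  // Longer than API Gateway's 29-second cap, which does not apply to function URLs.
  timeout: cdk.Duration.seconds(40),
  runtime: lambda.Runtime.NODEJS_16_X,
  architecture: lambda.Architecture.ARM_64,
  handler: 'lambda.handler',
  code: lambda.Code.fromAsset(path.join(__dirname, '../dist')),
  environment: {
    ...parsedDotenv // environment variables loaded elsewhere (not shown)
  }
});
// Expose the function directly over HTTPS, bypassing API Gateway.
backendApi.addFunctionUrl({
  authType: lambda.FunctionUrlAuthType.NONE, // public endpoint, no IAM auth
  cors: {
    // ['*'] allows all origins.
    // Use e.g. ['https://example.com'] to allow calls from that site only.
    allowedOrigins: ['*']
  }
});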

You can't increase the timeout, at least not now. Your endpoints must complete in 10 seconds or less. You need to work on improving the speed of your endpoints.
http://docs.aws.amazon.com/apigateway/latest/developerguide/limits.html

Lambda functions time out after a maximum of 5 minutes; API Gateway requests time out after 29 seconds. You can't change that, but you can work around it with an asynchronous execution pattern, which I wrote a blog post about:
https://joarleymoraes.com/serverless-long-running-http-requests/

I wanted to comment on "joarleymoraes" post but don't have enough reputation. The only thing to add to that is that you don't HAVE to refactor to use async, it just depends on your backend and how you can split it up + your client side retries.
If you aren't seeing a high percentage of 504's and you aren't ready for async processing, you can implement client side retries with exponential backoff on them so they aren't permanent failures.
The AWS SDK automatically implements retries with backoff, so it can help to make it easier, especially since Lambda Layers will allow you to maintain the SDK for your functions without having to constantly update your deployment packages.
Once you do that, you will have less visibility into those timeouts, since they are no longer permanent failures. This can buy you some time to deal with the core problem, which is that you are seeing 504s in the first place. That can certainly mean refactoring your code to be more responsive, splitting up large functions into smaller "microservice"-style pieces, and reducing external network calls.
The other benefit to retries is that if you retry all 5xx responses from an application, it can cover a lot of different issues which you might see during normal execution. It is generally considered in all applications that these issues are never 100% avoidable so it's best practice to go ahead and plan for the worst!
All of that being said, you should still work on reducing the lambda execution time or going async. This will allow you to set your timeout values to a much smaller number, which allows you to fail faster. This helps a lot for reducing the impact on the front end, since it doesn't have to wait 29 seconds to retry a failed request.
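The client-side retry idea above can be sketched roughly as follows. This is a minimal TypeScript sketch, not the AWS SDK's own retry logic: the names `backoffDelayMs`, `isRetryable` and `requestWithRetry`, the default delays, and the random jitter are all assumptions chosen for illustration.

```typescript
// Hypothetical helpers; none of these names come from the AWS SDK.

/** Delay cap for retry number `attempt` (0-based): baseMs * 2^attempt, limited to maxMs. */
function backoffDelayMs(attempt: number, baseMs: number = 200, maxMs: number = 5000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

/** Treat every 5xx status (including API Gateway's 504) as retryable. */
function isRetryable(status: number): boolean {
  return status >= 500 && status <= 599;
}

interface SimpleResponse { status: number; body: string }

/** Calls `send` until it returns a non-5xx status or the retries run out. */
async function requestWithRetry(
  send: () => Promise<SimpleResponse>,
  maxRetries: number = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<SimpleResponse> {
  let response = await send();
  for (let attempt = 0; attempt < maxRetries && isRetryable(response.status); attempt++) {
    // Jitter: wait a random time between 0 and the exponential cap.
    await sleep(Math.random() * backoffDelayMs(attempt));
    response = await send();
  }
  return response;
}
```

Blind retries like this are generally only safe for requests that can be repeated without side effects, so non-idempotent calls may need a request ID or similar guard.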

Timeouts can be decreased but cannot be increased beyond 29 seconds. The backend for your method must return a response within 29 seconds; otherwise API Gateway returns a 504 timeout error.
Alternatively, as suggested in some other answers, you can change the backend to send status code 202 (Accepted), meaning the request has been received successfully, and then continue processing in the background. Of course, you need to consider the use case and its requirements before implementing this workaround.

Lambda functions have a maximum execution time of 15 minutes, but since API Gateway has a strict 29-second timeout, you can do the following to overcome this.
For an immediate fix, try increasing your Lambda function's memory. E.g., if your Lambda function has 128 MB of memory, you can increase it to 256 MB. More memory helps the function execute faster.
OR
You can use the Lambda client's invoke() method, which is part of the "aws-sdk". With invoke() you call the function directly instead of going through API Gateway. But this is useful on the server side only.
OR
The best way to tackle this is: make the request to API Gateway -> inside the function, push the received data into an SQS queue -> immediately return the response -> have a Lambda function that is triggered when data is available in this SQS queue -> inside this triggered function, do the actual time-consuming work -> save the result to a data store -> if the call comes from the client side (browser/mobile app), implement long polling to get the final processed result from the same data store.
Since the API now returns the response immediately after pushing the data to SQS, your main function's execution time will be much shorter, which resolves the API Gateway timeout issue.
There are other methods, such as using WebSockets or writing event-driven code, but the methods above are much simpler to implement and manage.
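The queue-based flow described above can be illustrated with a small in-memory model. This is only a sketch of the control flow: the `queue` array stands in for SQS, the `results` map stands in for the data store, and the handler names are hypothetical; a real implementation would use the AWS services instead.

```typescript
// In-memory stand-ins (assumptions): `queue` plays the role of SQS,
// `results` plays the role of the data store.
interface Job { id: string; payload: string }
interface HttpResponse { statusCode: number; body: string }

const queue: Job[] = [];
const results = new Map<string, string>();
let nextId = 1;

/** API-facing function: enqueue the work and answer immediately with 202. */
function submitHandler(payload: string): HttpResponse {
  const job: Job = { id: "job-" + String(nextId++), payload };
  queue.push(job);
  return { statusCode: 202, body: JSON.stringify({ jobId: job.id }) };
}

/** Queue-triggered worker: does the slow part and stores the result. */
function workerHandler(): void {
  const job = queue.shift();
  if (job === undefined) return;
  // Placeholder for the real long-running work.
  results.set(job.id, job.payload.toUpperCase());
}

/** Polling endpoint: 200 with the result once stored, 202 while it is missing. */
function statusHandler(jobId: string): HttpResponse {
  const result = results.get(jobId);
  return result === undefined
    ? { statusCode: 202, body: JSON.stringify({ status: "pending" }) }
    : { statusCode: 200, body: JSON.stringify({ status: "done", result }) };
}
```

The client keeps calling the polling endpoint with the returned `jobId` until it gets a 200. Note that this sketch also reports unknown job IDs as pending, which a real service would probably want to distinguish.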

While you cannot increase the timeout, you can chain Lambdas together if the work is something that can be split up.
Using the AWS SDK:
var aws = require('aws-sdk');
var lambda = new aws.Lambda({
  region: 'us-west-2' // change to your region
});
// Runs inside the calling Lambda's handler, where `event` and `context` are in scope.
lambda.invoke({
  FunctionName: 'name_of_your_lambda_function',
  Payload: JSON.stringify(event, null, 2) // pass params
}, function(error, data) {
  if (error) {
    // Return here: `data` is null when the invocation fails.
    return context.done('error', error);
  }
  if (data.Payload) {
    context.succeed(data.Payload);
  }
});
Source: Can an AWS Lambda function call another
AWS Documentation: http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/Lambda.html

As of May 21, 2021, this is still the same. The hard limit for the maximum time is 30 seconds. Below is the official document on quotas for API Gateway.
https://docs.aws.amazon.com/apigateway/latest/developerguide/limits.html#http-api-quotas

The timeout limit cannot be increased, so a response should be returned within 30 seconds. The workarounds I usually use:
- Send the result asynchronously. The Lambda function should trigger another process and send a response to the client saying "Successfully started process X"; the other process should then notify the client asynchronously once it finishes (hit an endpoint, send a Slack notification or an email, ...). You can find a lot of interesting resources on this topic.
- Utilize the full potential of multiprocessing in your Lambda function and increase the memory for faster computation.
- Finally, if you need to return the result synchronously and one Lambda function cannot do the job, you could integrate API Gateway directly with Step Functions so that multiple Lambda functions work in parallel. It may seem complicated, but in fact it is quite simple.

You can set a custom timeout between 50 and 29,000 milliseconds for WebSocket APIs and between 50 and 30,000 milliseconds for HTTP APIs. The default timeout is 29 seconds for WebSocket APIs and 30 seconds for HTTP APIs.
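As a small illustration of those ranges, the check below validates a timeout value before it is put into a configuration. It is a sketch only: `isValidIntegrationTimeout` and `TIMEOUT_RANGE_MS` are hypothetical names, and the numbers are simply the ones quoted above.

```typescript
type ApiKind = "websocket" | "http";

// Ranges quoted in the answer above, in milliseconds.
const TIMEOUT_RANGE_MS: Record<ApiKind, { min: number; max: number }> = {
  websocket: { min: 50, max: 29000 },
  http: { min: 50, max: 30000 },
};

/** Returns true when `timeoutMs` is a whole number inside the allowed range. */
function isValidIntegrationTimeout(kind: ApiKind, timeoutMs: number): boolean {
  const { min, max } = TIMEOUT_RANGE_MS[kind];
  return Number.isInteger(timeoutMs) && timeoutMs >= min && timeoutMs <= max;
}
```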

Related

AWS API Gateway Slowness

This has me perturbed. I have a basic API gateway that is supposed to be capped at 10,000 requests per second with 5,000 request bursts. However, when hooked up to Lambdas, best I can hit currently is ~70 requests / second.
The end-points I have are basic Lambda proxies created with Serverless framework (HTTP EDGE).
I know that the lambda itself is not the bottleneck as I have the same issue when I replace the lambda with an empty function. I have 100+ allocated concurrency for the lambda, but the lambda never appears to hit the limit.
functions:
  loadtest:
    handler: loadtest/index.handler
    reservedConcurrency: 200
    events:
      - http: POST load_test
I'm wondering if there's something that I am overlooking here. My test runs for a minute and attempts to hit 200 req / sec (works fine with other so it's not my bandwidth). The delays grow to be as much as 20-30s at some point, so clearly something is choking up.
If it's a warm up issue - how long am I expected to run such load until everything is running warm?
Any ideas on where to look or additional information that I could share?
[Edit] I am using node12.x and I even tried with this code:
const AWS = require('aws-sdk');
AWS.config.update({region: '<my-region>'});
var sqs = new AWS.SQS({apiVersion: '2012-11-05'});
exports.handler = async (event, context) => {
  return {"status":"ok", ... }
};
The results were basically the same. I'm not sure where the bottle neck is, to be honest. I can try further testing with concurrency on the lambda side, but going from 100 to 200 had no effect - the completed requests clocks at around 70/s for an empty function.
Also, I'm using loadtest npm package to perform the loadtest and this is what the output looks like:
{ totalRequests: 8200,
  totalErrors: 0,
  totalTimeSeconds: 120.00341689999999,
  rps: 68,
  meanLatencyMs: 39080.6,
  maxLatencyMs: 78490,
  minLatencyMs: 427,
  percentiles: { '50': 38327, '90': 70684, '95': 74569, '99': 77679 },
  errorCodes: {},
  instanceIndex: 0 }
Here's a picture of how provisioned concurrency looked like over that period of time. I ran this over 2 minutes with the target at 200 req/sec.
It appears that this is actually an issue with WSL2 and Node.js. The exact nature of it is still unclear, but it is not an issue with API Gateway itself. I demonstrated this by running the test on a MacBook, where everything worked fine and request counts were high.
There are posts suggesting that this is an issue with the Node HTTP client & DNS, so perhaps that's a good starting point, but the above question is moot.

Rate Exceeded on AWS Lambda Using API Gateway and serverless framework

When I try to invoke a method that has an HTTP event, it results in a 500 Internal Server Error.
The CloudWatch logs show: Recoverable error occurred (Rate Exceeded.)
When I try to invoke a function without lambda it executes with response.
Here is my serverless config:
You have set your Lambda's reservedConcurrency to 0. This will prevent your Lambda from ever being invoked. Setting it to 0 is usually useful when your functions are getting invoked but you're not sure why and you want to stop it right away.
If you want to have it invoked, change reservedConcurrency to a positive integer (by default, it can be a positive integer <= 1000, but you can increase this limit by contacting AWS) or simply remove the reservedConcurrency attribute from your .yml file as it will use the default values.
Why would one ever use reservedConcurrency anyway? Let's say your Lambda functions are triggered by requests from API Gateway. You get 400 requests/second at peak hours and, upon every request, two other Lambda functions are triggered: one to generate a thumbnail for a given image and one to insert some metadata into DynamoDB. In theory you'd have 1200 Lambda functions running at the same time (assuming all of them finish within a second). This would lead to throttling, as the default limit on concurrent executions for Lambda functions is 1000.
But is the thumbnail generation as important as the requests coming from API Gateway? Very likely not, as it's naturally an eventually consistent task. So you could set reservedConcurrency on the thumbnail Lambda to only 200 so that it can't use up your concurrency, leaving other functions able to spin up and do something more useful at a given point in time (in our example, receiving HTTP requests is more important than generating thumbnails). The remaining 800 could then be split between the function triggered from API Gateway and the one that inserts data into DynamoDB, preventing throttling for the important stuff and keeping the not-so-important stuff eventually consistent.
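The arithmetic in this example can be written out as a short sketch. It assumes, as the answer does, that every invocation takes about one second, and it estimates concurrency as invocations per second multiplied by average duration; the variable names are made up for illustration.

```typescript
/** Rough estimate: concurrent executions = invocations per second * average duration (s). */
function concurrentExecutions(invocationsPerSecond: number, avgDurationSeconds: number): number {
  return invocationsPerSecond * avgDurationSeconds;
}

const accountLimit = 1000;      // default concurrency limit cited in the answer
const requestsPerSecond = 400;  // peak API Gateway traffic in the example
const functionsPerRequest = 3;  // API handler + thumbnail generator + DynamoDB writer
const avgDurationSeconds = 1;   // assumption: each invocation finishes within a second

// Without reserved concurrency, all three functions compete for the same pool.
const unreservedDemand = concurrentExecutions(
  requestsPerSecond * functionsPerRequest,
  avgDurationSeconds,
);
const wouldThrottle = unreservedDemand > accountLimit;

// Capping the thumbnail function at 200 leaves the rest for the other two.
const thumbnailReserved = 200;
const remainingForOthers = accountLimit - thumbnailReserved;
const othersDemand = concurrentExecutions(requestsPerSecond * 2, avgDurationSeconds);
```

With these numbers the unreserved demand exceeds the limit, while the two important functions fit exactly into what remains after the thumbnail function is capped.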

How can I keep warm an AWS Lambda invoked from API Gateway with proxy integration

I have defined a lambda function that is invoked from API Gateway with proxy integration. Thus, I have defined an eager resource path for it:
And referenced my lambda function:
My lambda is able to process request like GET /myresource, POST /myresource.
I have tried this strategy to keep it warm, described in acloudguru. It consists of setting up a CloudWatch event rule that invokes the lambda every 5 minutes to keep it warm. Unfortunately it isn't working.
This is the behaviour I have seen:
After some period, let's say 20 minutes, I call GET /myresource from API Gateway and it takes around 15 seconds. Subsequent requests last ~30ms. The CloudWatch event is making no difference...
Let's suppose another long period without calling the gateway. If I go to the Lambda console and invoke it directly (test button) it answers right away (less than 1ms) with a 404 (that's normal because my lambda expects GET /myresource or POST /myresource).
Immediately after this lambda console execution I call GET /myresource from API Gateway and it still takes ~20 seconds. That is to say, the function was still cold despite having being invoked from the Lambda console. This might explain why the CloudWatch event doesn't work since it calls the lambda without setting the method/resource-url.
So, how can I make this particular case with API Gateway with proxy integration + Lambda stay warm to prevent those slow first request?
As of now (2019-02-27) [1], a periodic CloudWatch event rule does not deterministically solve the cold start issue, but it does reduce the probability of cold starts.
The reason is that it's up to the Lambda service to decide whether to use a new Lambda container instead of an existing one to process an incoming request. Some details on how Lambda containers are reused are explained in [2].
To reduce the cold start time (not the number of cold starts), can you try the following? 1. Increase the memory allocated to the function; 2. reduce the deployment package size (e.g. remove unnecessary dependencies); and 3. use a language like Node.js or Python instead of Java or .NET.
[1] According to a re:Invent session (39:50 at https://www.youtube.com/watch?v=QdzV04T_kec), the Lambda team expects to improve the VPC cold start latency in Lambda.
[2] https://aws.amazon.com/blogs/compute/container-reuse-in-lambda/
Denis is quite right about the non deterministic lambda behaviour regarding the number of containers hit by CloudWatch events. I'll follow his advice to improve the startup time.
On the other hand I have managed to make my CloudWatch events hit the lambda function properly, reducing (in many cases) the number of cold starts.
I just had to add an additional controller mapped to "/" with a hardcoded response:
@Controller("/")
class WarmUpController {
    private val logger = LoggerFactory.getLogger(javaClass)
    @Get
    fun warmUp(): String {
        logger.info("Warming up")
        return """{"message" : "warming up"}"""
    }
}
With this in place the default (/) invocation from CloudWatch does keep the container warm most of the time.

Catching timeout errors in AWS Api Gateway

Since the API Gateway time limit is 10 seconds for any request, I'm trying to deal with timeout errors, but I haven't found a way to catch them and respond with a custom message.
Context of the problem: I have a function that takes less than 2 seconds to execute, but on a cold start it sometimes takes more than 10 seconds to create a connection to DynamoDB in Java. I've already optimized my function using threads, but I still cannot stay within the 10-second limit for the initial call.
I need to find a way to deliver a response model like this:
{
  "error": "timeout"
}
To find a solution I created a function in Lambda that intentionally responds something after 10 seconds of execution. Doing the integration with Api Gateway I'm getting this response:
Request: /example/lazy
Status:
Latency: ms
Response Body
{
  "logref": "********-****-****-****-1d49e75b73de",
  "message": "Timeout waiting for endpoint response"
}
In the documentation I found that you can catch these errors using an HTTP status regex in the Integration Response. But I haven't found a way to do so, and it seems that nobody on the Internet is having the same problem, as I haven't found this specific message in any forum.
I have tried with these regex:
.*"message".*
Timeout.*
.*"status":400.*
.*"status":404.*
.*"status":504.*
.*"status":500.*
Does anybody know which regex I should use to capture this "message": "Timeout... ?
You are using the Test Invoke feature from the console, which has a timeout limit of 10 seconds. But the deployed API's timeout is 30 seconds, as mentioned here. So that should be good enough to handle the Lambda cold start case. Please deploy and then test using the API link. If that times out because your endpoint takes more than 30 seconds, the response would be:
{"message": "Endpoint request timed out"}
To clarify, you can configure your method response based on the HTTP status code of integration response. But in case of timeout, there is no integration response. So, you cannot use that feature to configure the method response during timeout.
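Since the gateway produces the timeout body itself, one possible workaround is to translate it on the calling side. The sketch below is an assumption rather than an API Gateway feature: `normalizeGatewayResponse` is a hypothetical client-side helper, and it assumes the deployed API answers a timeout with status 504 and the message quoted above.

```typescript
interface ApiResult { status: number; body: string }

/** Maps API Gateway's own timeout response to the {"error": "timeout"} model. */
function normalizeGatewayResponse(res: ApiResult): ApiResult {
  if (res.status !== 504) return res;
  let message = "";
  try {
    const parsed: unknown = JSON.parse(res.body);
    if (typeof parsed === "object" && parsed !== null && "message" in parsed) {
      message = String((parsed as { message: unknown }).message);
    }
  } catch (e) {
    // Non-JSON body: keep `message` empty so the response is returned unchanged.
  }
  return message === "Endpoint request timed out"
    ? { status: 504, body: JSON.stringify({ error: "timeout" }) }
    : res;
}
```

Any response that is not the gateway's timeout message passes through untouched, so real backend errors remain visible to the caller.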
You can improve the cold start time by allocating more memory to your Lambda function. With the default 512MB, I am seeing cold start times of 8-9 seconds for functions written in Java. This improves to 2-3 seconds with 1536MB of memory.
Amazon says that it is the CPU allocation that is really important, but there is no way to increase it directly. CPU allocation increases in proportion to memory.
And if you want close to zero cold start times, keeping the function warm is the way to go, as described here.

Is it possible to make an HTTP request from one Lambda function, and handle the response in another?

AWS Lambda functions are supposed to respond quickly to events. I would like to create a function that fires off a quick request to a slow API, and then terminates without waiting for a response. Later, when a response comes back, I would like a different Lambda function to handle the response. I know this sounds kind of crazy, when you think about what AWS would have to do to hang on to an open connection from one Lambda function and then send the response to another, but this seems to be very much in the spirit of how Lambda was designed to be used.
Ideas:
Send messages to an SQS queue that represent a request to be made. Have some kind of message/HTTP proxy type service on an EC2 / EB cluster listen to the queue and actually make the HTTP requests. It would put response objects on another queue, tagged to identify the associated request, if necessary. This feels like a lot of complexity for something that would be trivial for a traditional service.
Just live with it. Lambda functions are allowed to run for 60 seconds, and the API calls that I make don't generally take longer than 10 seconds. I'm not sure how costly it would be to have LFs spend 95% of their running time waiting on a response, but "waiting" isn't what LFs are for.
Don't use Lambda for anything that interacts with 3rd party APIs that aren't lightning fast :( That is what most of my projects do these days, though.
It depends on how many calls this Lambda will execute monthly and how much memory you are allocating to it. The new timeout for Lambda is 5 minutes, which should (hopefully :p) be more than enough for an API to respond. I think you should let Lambda deal with all of it so as not to overcomplicate the workflow. Lambda pricing is generally really cheap.
E.g., a Lambda executed 1 million times with 128 MB allocated, running 10 seconds each time, would cost approximately $20, without considering the potential free tier.
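The estimate above can be reproduced with a short calculation. The two prices below are assumptions (historical list prices of about $0.00001667 per GB-second and $0.20 per million requests); check the current Lambda pricing page before relying on them. The function name is hypothetical.

```typescript
// Assumed prices in USD; verify against the current Lambda pricing page.
const PRICE_PER_GB_SECOND = 0.00001667;
const PRICE_PER_MILLION_REQUESTS = 0.2;

/** Estimated cost in USD, ignoring the free tier. */
function lambdaCostUsd(invocations: number, memoryMb: number, durationSeconds: number): number {
  // Billed compute is measured in GB-seconds: memory (GB) * duration (s) per invocation.
  const gbSeconds = invocations * durationSeconds * (memoryMb / 1024);
  const computeCost = gbSeconds * PRICE_PER_GB_SECOND;
  const requestCost = (invocations / 1e6) * PRICE_PER_MILLION_REQUESTS;
  return computeCost + requestCost;
}

// The example from the answer: 1 million invocations, 128 MB, 10 seconds each.
const exampleCostUsd = lambdaCostUsd(1e6, 128, 10);
console.log(exampleCostUsd.toFixed(2));
```

Under these assumed prices the result is roughly $21, in line with the "approximately $20" figure, and almost all of it comes from the time spent waiting rather than from the request charge.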