How to reduce AWS API Gateway response time - amazon-web-services

Responses from an API on AWS API Gateway integrated with a Lambda
function take a lot more time than a regular Node project
on AWS Elastic Beanstalk.
Is there any way to reduce the response time for AWS API Gateway?

More information is definitely needed to answer this question, but from what you've said, your problem may be caused by the cold start time of Lambda functions. An Elastic Beanstalk stack spins up EC2 instances, which stay ready from the moment they start until they are removed. Lambda instead creates instances of your handler as needed to serve incoming traffic. The first time you call a Lambda function, it has to provision an execution environment, and depending on the language used, this can take some time. Successive requests should be faster unless you wait a while, in which case the Lambda needs to re-initialize.
So here's more information that would be useful in case this answer is not helpful:
How much slower is Lambda than your Elastic Beanstalk stack?
Is it slower only on the first couple requests or does it continue being slow when you keep requesting?
Is it slow every day or only occasionally?
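To answer the second question above from your own logs, one option is to mark cold starts explicitly. The following is a minimal sketch, assuming a Node.js handler; the handler body and the logged field names are illustrative, not part of the original question. It relies on the fact that module-scope code runs once per execution environment, so a flag set there is only true on the first invocation.

```javascript
// Module scope runs once per execution environment, i.e. once per cold start.
const initializedAt = Date.now();
let isColdStart = true;

// Hypothetical handler; replace the response with your real logic.
async function handler(event) {
  const coldStart = isColdStart;
  isColdStart = false; // every later invocation in this environment is warm

  // One structured log line per request, so cold and warm requests can be compared.
  console.log(JSON.stringify({
    coldStart,
    environmentAgeMs: Date.now() - initializedAt,
  }));

  return { statusCode: 200, body: JSON.stringify({ coldStart }) };
}

module.exports = { handler };
```

If the slow requests line up with the log lines where coldStart is true, cold starts are the likely cause; if warm requests are slow as well, the problem is probably elsewhere.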

Related

lambda and fargate errors/timeouts

I have a Python API that I have tried on VMs, Fargate, and Lambda.
VMs - fewest errors when capacity is large enough.
Fargate - second-fewest errors when capacity is large enough, but when autoscaling, I get some 500 errors. It looks like it doesn't autoscale quickly enough.
Lambda - less consistent. When there are a lot of API calls, there are fewer errors, but from a cold start it may periodically fail. I do not pre-provision; when I do, I get fewer errors too.
I read in the post below that a cold start for Lambda takes less than 1 second, but it seems to take longer. One caveat is that each Lambda function checks for an existing "env" file and, if it does not exist, downloads it from S3. However, this is done only when the API is hit: the Lambda function responds and connects, downloads the .env file, and then processes the API call. Fargate does the same, but again with fewer errors. Any thoughts?
I can pre-provision, but it gets kind of expensive. At that point, I might as well go back to VMs with autoscaling groups, but that's less cloud native. The VMs provide the fastest response by far but are harder to manage.
Can AWS Lambda coldout cause API Gateway timeout(30s)?
I'm using an ALB in front of Lambda and Fargate. The VMs simply use round-robin DNS.
Questions:
Am I doing something wrong with Fargate or Lambda? Are they alright for APIs or should I just go back to VMs?
What or who maintains the API connection while Lambda is starting up from a cold start? Can I have it retry or hold on to the connection longer?
thanks!
am i doing something wrong with fargate or lambda? are they alright for apis or should i just go back to vms?
The one thing that strikes me is downloading the env file from S3. Wouldn't it be easier and faster to keep your env data in SSM Parameter Store? Or perhaps pass the values as environment variables to the Lambda function itself.
what or who maintains api connection while lambda is starting up from a cold start? can i have it retry or hold on to the connection longer?
API Gateway. Sadly, you can't extend the 30 s time limit; it's a hard limit.
i'm using an ALB in front of lambda and fargate.
It seems to me that you have API Gateway -> ALB -> Lambda function. Why would you need an ALB in that chain? Usually there is no such need.
i can pre-provision, but it gets kind of expensive.
Sadly, this is the only way to minimize cold-starts.
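As a rough illustration of the suggestion above to stop downloading the env file on the request path, here is a minimal sketch that reads settings from Lambda environment variables and caches them at module scope, so the load happens at most once per execution environment. The variable names DB_URL and API_KEY, the getConfig helper, and the handler body are all hypothetical. The loader is injectable, so a Parameter Store or S3 loader could be swapped in without changing the caching logic.

```javascript
// Cached promise: module scope survives between invocations of a warm environment.
let configPromise = null;

// Assumption: settings are passed as Lambda environment variables.
// DB_URL and API_KEY are hypothetical names.
async function loadConfigFromEnv() {
  return {
    dbUrl: process.env.DB_URL,
    apiKey: process.env.API_KEY,
  };
}

// `loader` is injectable so a Parameter Store or S3 loader could be used instead.
function getConfig(loader = loadConfigFromEnv) {
  if (configPromise === null) {
    configPromise = loader().catch((err) => {
      configPromise = null; // do not cache a failed load; retry on the next call
      throw err;
    });
  }
  return configPromise;
}

// Hypothetical handler: only the first request in each environment pays for the load.
async function handler(event) {
  const config = await getConfig();
  return {
    statusCode: 200,
    body: JSON.stringify({ hasDbUrl: Boolean(config.dbUrl) }),
  };
}

module.exports = { handler, getConfig };
```

Caching the promise rather than the resolved value means that concurrent callers during the first load share one request instead of each starting their own.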

AWS Serverless: Force parallel lambda execution based on request or HTTP API parameters

Is there a way to force AWS to execute a Lambda request coming from an API Gateway resource in a certain execution environment? We have a use case where we use one codebase with various models that are 100-300 MB each, so on their own small enough to fit in the ephemeral storage, but too big to play well together.
Currently, a second invocation with a different model will use the existing (warmed up) lambda function, and run out of storage.
I'm hoping to attach something like a parameter to the request that forces lambda to create parallel versions of the same function for each of the models, so that we don't run over the 512 MB limit and optimize the cold-boot times, ideally without duplicating the function and having to maintain the function in multiple places.
I've tried to investigate Step Functions, but I'm not sure if there's an option for parameter-based conditionality there. AWS suggests using EFS to circumvent the ephemeral storage limits, but from what I can find, using EFS will be a lot slower than reading from the ephemeral /tmp/ directory.
To my knowledge: no. You cannot control the execution environments. The only thing you can do is limit the concurrent executions.
So you never know whether a single Lambda is serving all the events triggered from API Gateway or several are running in parallel. You also have no control over which of the execution environments serves the next request.
If your issue is the /tmp directory limit for AWS Lambda, why not try EFS?

AWS Lambda inside VPC. 504 Gateway Timeout (ENI?)

I have a Serverless .net core web api lambda application deployed on AWS.
I have this sitting inside a VPC as I access ElasticSearch service inside that same VPC.
I have two API microservices that connect to the Elasticsearch service.
After a period of non use (4 hours, 6 hours, 18 hours - I'm not sure exactly but seems random), the function becomes unresponsive and I get a 504 gateway timeout error, "Endpoint cannot be found"
I read somewhere that if "idle" for too long, the ENI is released back into the AWS system and that triggering the Lambda again should start it up.
I can't seem to "wake" up the function by calling it as it keeps timing out with the above error (I have also increased the timeouts from default).
Here's the kicker - If I make any changes to the specific lambda function, and save those changes (this includes something as simple as changing the timeout value) - My API calls (BOTH of them, even though different lambdas) start working again like it has "kicked" back in. Obviously the changes do this, but why?
Obviously I don't want timeouts in a production environment regardless of how much, OR how little the lambda or API call is used.
I need a bulletproof solution to this. Surely it's a config issue of some description but I'm just not sure where to look.
I have altered route tables, public/private subnets, CIDR blocks, created internet gateways, NAT etc. for the VPC. This all works, but these two Lambdas, which require VPC access, keep falling "asleep" somehow.
This is because of Lambda cold starts.
There is a feature released at re:Invent 2019 called provisioned concurrency for Lambda (not to be confused with reserved concurrency).
Set the provisioned concurrency to at least 1 (or to the number of requests to be served in parallel) to keep the Lambda warm at all times and ready to serve requests.
Ref: https://aws.amazon.com/blogs/aws/new-provisioned-concurrency-for-lambda-functions/
For more context, Lambda in a VPC uses Hyperplane ENIs, and functions in the same account that share the same security group:subnet pairing use the same network interfaces.
If the Lambda functions in an account go idle for some time (typically no usage for 40 minutes across all functions using that ENI, according to AWS support), the service will reclaim the unused Hyperplane resources, so very infrequently invoked functions may still see longer cold-start times.
Ref: https://aws.amazon.com/blogs/compute/announcing-improved-vpc-networking-for-aws-lambda-functions/
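For illustration, the provisioned concurrency setting recommended above could be applied from Node.js roughly as follows. This is a sketch, not the answerer's own setup: it assumes the AWS SDK for JavaScript v3 package @aws-sdk/client-lambda is installed and that credentials are configured, and the helper names are made up for this example. Provisioned concurrency is set on a published version or alias rather than on $LATEST, so a qualifier is required.

```javascript
// Builds the request for Lambda's PutProvisionedConcurrencyConfig API.
// `qualifier` must be a published version number or an alias name.
function buildProvisionedConcurrencyParams(functionName, qualifier, count = 1) {
  if (!Number.isInteger(count) || count < 1) {
    throw new RangeError("count must be an integer >= 1");
  }
  return {
    FunctionName: functionName,
    Qualifier: qualifier,
    ProvisionedConcurrentExecutions: count,
  };
}

// Applies the setting. Assumes @aws-sdk/client-lambda is installed
// and AWS credentials and region are available in the environment.
async function enableProvisionedConcurrency(functionName, qualifier, count = 1) {
  // Loaded lazily so the builder above can be used without the SDK.
  const { LambdaClient, PutProvisionedConcurrencyConfigCommand } =
    require("@aws-sdk/client-lambda");
  const client = new LambdaClient({});
  return client.send(
    new PutProvisionedConcurrencyConfigCommand(
      buildProvisionedConcurrencyParams(functionName, qualifier, count)
    )
  );
}

module.exports = { buildProvisionedConcurrencyParams, enableProvisionedConcurrency };
```

A call such as enableProvisionedConcurrency("search-api", "live", 1) would keep one environment initialized for the alias "live"; both names are placeholders. API Gateway must then invoke that alias or version, not the unqualified function, for the setting to have any effect.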

AWS EC2 vs Serverless Cost Comparison

I am currently using AWS EC2 for my workloads.
Now I want to move from the EC2 server to a serverless platform, i.e. API Gateway and Lambda.
I have also followed different blogs and I am ready to go with the serverless. But, my one concern is on pricing.
How can I predict per month cost for the serverless according to my use of EC2? Will the EC2 Cloudwatch metrics help me to calculate the costing?
How can I make cost comparison?
Firstly, there is no simple answer to your question as a simple lift and shift from a VM to Lambda is not ideal. To make the most of lambda, you need to architect your solution to be serverless. This means making use of the event-driven nature of Lambda.
Now to answer the question simply: you are charged only for the time it takes to serve a request, rounded up to the next 100 ms. So if your Lambda responds to the request in 70 ms, you pay for 100 ms of execution time. If you serve the request in 210 ms, then you pay for 300 ms.
The charge also scales with the memory allocated to the function: execution time is billed in GB-seconds.
If you have a good logging or monitoring strategy you could check how long it takes to serve each type of request and how often they occur. If your application is fairly low-scale and is not accessed often (say all requests come within an 8 hour window) then you may end up saving money with lambda as you are only paying AWS for the time spent serving the request.
Also, it may help to read the following article on common pitfalls:
https://medium.com/#emaildelivery/serverless-pitfalls-issues-you-may-encounter-running-a-start-up-on-aws-lambda-f242b404f41c
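The billing rule described in this answer can be turned into a rough estimate once you know your request counts and durations. The sketch below is only an illustration: the function names are invented, the 100 ms increment is taken from the answer and exposed as a parameter in case the billing granularity differs, and the prices are inputs that you would need to look up for your region. API Gateway charges are not included.

```javascript
// Rounds a duration up to the next billing increment (100 ms, per the answer above).
function billedMs(durationMs, incrementMs = 100) {
  return Math.ceil(durationMs / incrementMs) * incrementMs;
}

// Estimates monthly Lambda cost. Prices are inputs, not built-in facts:
// look up the current values for your region before relying on the result.
function estimateMonthlyCost({
  requestsPerMonth,
  avgDurationMs,
  memoryMb,
  pricePerGbSecond,
  pricePerMillionRequests,
  incrementMs = 100,
}) {
  const billedSeconds = billedMs(avgDurationMs, incrementMs) / 1000;
  // GB-seconds = requests x billed seconds per request x memory in GB.
  const gbSeconds = requestsPerMonth * billedSeconds * (memoryMb / 1024);
  const computeCost = gbSeconds * pricePerGbSecond;
  const requestCost = (requestsPerMonth / 1000000) * pricePerMillionRequests;
  return { gbSeconds, computeCost, requestCost, total: computeCost + requestCost };
}

module.exports = { billedMs, estimateMonthlyCost };
```

Here billedMs(70) gives 100 and billedMs(210) gives 300, matching the examples in the answer. Using the average duration is a simplification; if your durations vary widely, summing billedMs over per-request durations from your logs gives a closer estimate.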

Is AWS Lambda good for real-time API Rest?

I'm learning about AWS Lambda and I'm worried about synchronous real-time requests.
The fact that Lambda has a "cold start" doesn't sound good for handling GET requests.
Imagine a user is using the application and makes a GET HTTP request to get a product or a list of products. If the Lambda is sleeping, it will take 10 seconds to respond, and I don't see this as an acceptable response time.
Is it good or bad practice to use AWS Lambda for a classic (synchronous response) REST API?
Like most things, I think you should measure before deciding. A lot of AWS customers use Lambda as the back-end for their webapps quite successfully.
There's a lot of discussion out there on Lambda latency, for example:
2017-04 comparing Lambda performance using Node.js, Java, C# or Python
2018-03 Lambda call latency
2019-09 improved VPC networking for AWS Lambda
2019-10 you're thinking about cold starts all wrong
In December 2019, AWS Lambda introduced Provisioned Concurrency, which improves things. See:
2019-12 AWS Lambda announces Provisioned Concurrency
2020-09 AWS Lambda Cold Starts: Solving the Problem
You should measure latency for an environment that's representative of your app and its use.
A few things that are important factors related to request latency:
cold starts => higher latency
request patterns are important factors in cold starts
if you need to deploy in VPC (attachment of ENI => higher cold start latency)
using CloudFront --> API Gateway --> Lambda (more layers => higher latency)
choice of programming language (Java likely highest cold-start latency, Go lowest)
size of Lambda environment (more RAM => more CPU => faster)
Lambda account and concurrency limits
pre-warming strategy
Update 2019-12: see Predictable start-up times with Provisioned Concurrency.
Update 2021-08: see Increasing performance of Java AWS Lambda functions using tiered compilation.
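As a starting point for the measurement recommended above, here is a small sketch that times repeated calls and summarizes them as percentiles, since cold starts usually show up in the tail rather than in the average. The function names are made up for this example, and requestFn stands for any async function that performs one request against your own endpoint.

```javascript
// Returns the p-th percentile (nearest-rank method) of latency samples in ms.
function percentile(samples, p) {
  if (samples.length === 0) {
    throw new Error("no samples");
  }
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Calls `requestFn` sequentially `count` times and records each duration in ms.
async function measure(requestFn, count) {
  const samples = [];
  for (let i = 0; i < count; i += 1) {
    const start = process.hrtime.bigint();
    await requestFn();
    samples.push(Number(process.hrtime.bigint() - start) / 1e6);
  }
  return {
    p50: percentile(samples, 50),
    p95: percentile(samples, 95),
    max: Math.max(...samples),
    samples,
  };
}

module.exports = { percentile, measure };
```

Running the measurement once after the function has been idle for a while and once under steady traffic gives a feel for how much of the latency comes from cold starts with your request pattern.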
As an AWS Lambda + API Gateway user (with Serverless Framework) I had to deal with this too.
The problem I faced:
Few requests per day per lambda (not enough to keep lambdas warm)
Time critical application (the user is on the phone, waiting for text-to-speech to answer)
How I worked around that:
The idea was to find a way to call the critical lambdas often enough that they don't get cold.
If you use the Serverless Framework, you can use the serverless-plugin-warmup plugin that does exactly that.
If not, you can copy its behavior by creating a worker that invokes the Lambdas every few minutes to keep them warm. To do this, create a Lambda that invokes your other Lambdas and schedule CloudWatch to trigger it every 5 minutes or so. Make sure to call your to-keep-warm Lambdas with a custom event.source so you can exit them early without running any actual business code, by putting the following code at the very beginning of the function:
if (event.source === 'just-keeping-warm') {
  // Warm-up ping: return before running any business logic.
  console.log('WarmUP - Lambda is warm!');
  return callback(null, 'Lambda is warm!');
}
Depending on the number of Lambdas you have to keep warm, this can be a lot of "warming" calls. AWS offers 1,000,000 free Lambda calls every month though.
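Both halves of this workaround can be sketched together. The code below is an illustration only, written with async handlers instead of the callback style; withWarmup and createWarmer are hypothetical helper names, and the invoke function is injected so the sketch does not depend on a particular SDK version.

```javascript
const WARMUP_SOURCE = 'just-keeping-warm';

// Wraps a real handler so warm-up events return before any business code runs.
function withWarmup(realHandler) {
  return async function handler(event, context) {
    if (event && event.source === WARMUP_SOURCE) {
      console.log('WarmUP - Lambda is warm!');
      return 'Lambda is warm!';
    }
    return realHandler(event, context);
  };
}

// Builds the scheduled "warmer" handler. `invoke(name, payload)` is injected;
// in production it would wrap the AWS SDK's Lambda Invoke call.
function createWarmer(functionNames, invoke) {
  return async function warmerHandler() {
    const payload = JSON.stringify({ source: WARMUP_SOURCE });
    // allSettled: one failing target must not stop the others from being warmed.
    const results = await Promise.allSettled(
      functionNames.map((name) => invoke(name, payload))
    );
    const failed = results.filter((r) => r.status === 'rejected').length;
    return { invoked: functionNames.length, failed };
  };
}

module.exports = { withWarmup, createWarmer, WARMUP_SOURCE };
```

With the AWS SDK for JavaScript v3, invoke could be implemented as a call to client.send(new InvokeCommand({ FunctionName: name, Payload: Buffer.from(payload) })) on a LambdaClient, assuming @aws-sdk/client-lambda is installed. Note that each scheduled call keeps only one execution environment per function warm; concurrent traffic can still hit cold environments.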
We have used AWS Lambda quite successfully with reasonable and acceptable response times. (REST/JSON based API + AWS Lambda + Dynamo DB Access).
In our latency measurements, invoking the functions always accounted for the smallest share of the time; most of it was spent in application logic.
There are warm up techniques as mentioned in the above posts.