I have an API Gateway attached to a Lambda function, and the Lambda returns its response within 300 ms on average.
At the API Gateway level, however, the integration latency sometimes takes much longer.
I can see this by capturing the integration latency in API Gateway's CloudWatch Logs.
At times it exceeds 29 seconds and API Gateway throws a gateway timeout, which drives up the total latency.
Is there a way to reduce the integration latency at the API Gateway level?
Or should we just live with the occasional 504 timeout errors?
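One way to quantify this is to pull `$context.integrationLatency` out of the access logs. A minimal sketch, assuming JSON access logging is configured with the (real) `$context.requestId`, `$context.status`, and `$context.integrationLatency` variables; the log lines here are made up:

```python
import json

# Hypothetical access-log lines, as emitted by an access-log format that
# includes $context.requestId, $context.status, $context.integrationLatency.
log_lines = [
    '{"requestId": "a1", "status": "200", "integrationLatency": "287"}',
    '{"requestId": "a2", "status": "504", "integrationLatency": "29531"}',
    '{"requestId": "a3", "status": "200", "integrationLatency": "301"}',
]

def slow_requests(lines, threshold_ms=1000):
    """Return (requestId, latency_ms) for entries whose integration
    latency exceeds threshold_ms."""
    slow = []
    for line in lines:
        entry = json.loads(line)
        latency = int(entry["integrationLatency"])
        if latency > threshold_ms:
            slow.append((entry["requestId"], latency))
    return slow

print(slow_requests(log_lines))  # -> [('a2', 29531)]
```

Plotting these values over time makes it obvious whether the spikes correlate with cold starts (bursts after idle periods) or with something else.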
There are a few things you could consider:
Move from a REST API Gateway to an HTTP API Gateway if you haven't already. It has fewer features, but if you are OK with that, you will gain a substantial speed improvement, along with lower costs and simpler maintenance.
Use provisioned concurrency to avoid cold starts (this comes with increased costs).
I would advise using an HTTP API if you are only using Lambdas. It's more recent, and frankly I think it's a lot easier to understand and use than a REST API.
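For the provisioned concurrency suggestion, a sketch of the parameters Lambda's PutProvisionedConcurrencyConfig API expects (the function name and alias are placeholders; note that provisioned concurrency must target a published version or alias, not $LATEST):

```python
def provisioned_concurrency_config(function_name, qualifier, executions):
    """Parameters for Lambda's PutProvisionedConcurrencyConfig API.
    The qualifier must be a published version or an alias."""
    return {
        "FunctionName": function_name,
        "Qualifier": qualifier,
        "ProvisionedConcurrentExecutions": executions,
    }

params = provisioned_concurrency_config("my-func", "live", 5)
# With boto3 (not executed here):
#   boto3.client("lambda").put_provisioned_concurrency_config(**params)
```

Size the execution count to your expected steady-state concurrency; requests above it fall back to on-demand instances and can still cold-start.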
I am trying to use API Gateway as the interface between my frontend and Lambda functions. Since API Gateway has a maximum timeout of 30 seconds and my Lambda takes longer than that to do its computation, can I use API Gateway WebSockets to make this possible?
I currently create RESTful APIs on API Gateway and just found out about WebSockets on API Gateway.
Does anyone have suggestions on how to make this work?
Depending on what your Lambda function is doing, it may be worthwhile to increase the Lambda memory configuration. Copied from the Lambda Developer Guide (emphasis mine):
Memory – The amount of memory available to the function during execution. [...] Lambda allocates CPU power linearly in proportion to the amount of memory configured.
Thus, by increasing the amount of Lambda memory, you also increase the Lambda CPU performance. For computationally intensive operations, this configuration can significantly decrease the response time, hopefully to under the API Gateway 30 s timeout.
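Per the same guide, the CPU allocation reaches the equivalent of one full vCPU at 1,769 MB, which gives a quick way to estimate how much CPU each memory setting buys you:

```python
# Lambda allocates CPU linearly in proportion to memory; per the AWS docs,
# a function has the equivalent of one full vCPU at 1,769 MB.
FULL_VCPU_MB = 1769

def approx_vcpus(memory_mb):
    """Rough vCPU share for a given Lambda memory setting."""
    return memory_mb / FULL_VCPU_MB

for mb in (128, 512, 1769, 3538):
    print(f"{mb:>5} MB -> ~{approx_vcpus(mb):.2f} vCPU")
```

So a CPU-bound function at 128 MB is running on roughly 7% of a vCPU, which alone can explain multi-second compute times.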
My Lambda function runs in less than 1300 ms when I click the Test button on the Lambda console page (https://eu-central-1.console.aws.amazon.com/lambda/home?region=eu-central-1#/functions/myfunc?tab=graph).
When I send a request to the Lambda via API Gateway, I have to wait about 4300 ms.
HTTP requests that go to the Lambda via the Gateway are 3-4 times slower.
I saw some similar forum posts, but I couldn't find a solution to this issue.
How can I reduce the latency?
API Gateway is known to introduce a lot of latency.
Lambda is not ideal for synchronous request/response patterns like the one you seem to be using. It's better suited to asynchronous processes, where the latency between invocation and execution is not as critical.
You should probably think about whether your system needs to be synchronous, and if it does, whether Lambda is the best answer.
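If the flow can be made asynchronous, a sketch of an async invocation via Lambda's Invoke API (the function name and payload are made up): with InvocationType="Event", the call returns as soon as Lambda queues the event, so the caller never waits on execution.

```python
import json

def async_invoke_params(function_name, payload):
    """Parameters for an asynchronous Lambda invocation.
    InvocationType="Event" makes Invoke return immediately (HTTP 202)
    rather than blocking until the function finishes."""
    return {
        "FunctionName": function_name,
        "InvocationType": "Event",
        "Payload": json.dumps(payload).encode(),
    }

params = async_invoke_params("my-func", {"job": 42})
# With boto3 (not executed here):
#   boto3.client("lambda").invoke(**params)
```

The client then polls for the result or receives it over a push channel (e.g. the WebSocket approach mentioned above), instead of holding an HTTP connection open.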
We would like to build a serverless architecture for our startup, supporting up to 1 million requests per second and 50 million active users. How can we handle this use case with an AWS architecture?
According to the AWS documentation, API Gateway can handle only 10K requests/s and Lambda only 1K concurrent invocations, which is unacceptable for us.
How can we overcome this limitation? Can we request higher throughput from AWS support, or can we connect to other AWS services (queues) somehow?
Thanks!
Those numbers you quoted are the default account limits. Lambda and API Gateway can handle more than that, but you have to send a request to Amazon to raise your account limits. If you are truly going to receive 1 million API requests per second then you should discuss it with an AWS account rep. Are you sure most of those requests won't be handled by a cache like CloudFront?
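As a back-of-envelope check on those limits (all numbers assumed for illustration): the concurrency Lambda needs is roughly origin requests per second times average duration, and a CloudFront cache in front cuts the origin rate first.

```python
target_rps = 1_000_000
avg_duration_s = 0.3          # assumed average Lambda duration
cache_hit_ratio = 0.9         # assumed share absorbed by CloudFront

# Requests that actually reach API Gateway / Lambda:
origin_rps = target_rps * (1 - cache_hit_ratio)
# Little's law: concurrency = arrival rate x average service time
required_concurrency = origin_rps * avg_duration_s

print(f"Origin requests/s: {origin_rps:,.0f}")
print(f"Required Lambda concurrency: {required_concurrency:,.0f}")
```

Even with a 90% cache hit ratio, you would need the account concurrency limit raised well beyond the default, which is exactly the conversation to have with an AWS account rep.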
The gateway is NOT your API server; the Lambdas are the bottleneck.
While the gateway can handle 100,000 messages/sec (because it goes through a message queue), Lambdas top out at around 2,200 rps even with scaling (https://amido.com/blog/azure-functions-vs-aws-lambda-vs-google-cloud-functions-javascript-scaling-face-off/).
This differs dramatically from actual API framework implementations, where throughput goes up to 3,500+ rps...
I think you should go with Application Load Balancer.
It has no limit in terms of RPS and can potentially be even cheaper for a large number of requests. It does have fewer integrations with AWS services, but in general it has everything you need for a gateway.
https://dashbird.io/blog/aws-api-gateway-vs-application-load-balancer/
Has anyone found a solution to API Gateway latency issues?
With a simple function testing API Gateway -> Lambda interaction, I regularly see cold starts in the 2.5s range, and once "warmed," response times in the 900ms - 1.1s range are typical.
I understand the TLS handshake has its own overhead, but testing similar resources (AWS-based or general sites that I believe are not geo-distributed) from my location shows results that are half that, ~500ms.
Is good news coming soon from AWS?
(I've read everything I could find before posting.)
Engineer with the API Gateway team here.
You said you've read "everything", but for context for others I want to link to a number of threads on our forums where I've publicly documented where a lot of this perceived latency on a single API call comes from:
Forum Post 1
Forum Post 2
In general, as you increase your call rates, your average latency will shrink as connection reuse mechanisms between your clients and CloudFront as well as between CloudFront and API Gateway can be leveraged. Additionally, a higher call rate will ensure your Lambda is "warm" and ready to serve requests.
That being said, we are painfully aware that we are not meeting the performance bar for a lot of our customers and are making strides towards improving this:
The Lambda team is constantly working on improving cold start times as well as attempting to remove them for functions that are seeing continuous load.
On API Gateway, we are currently in the process of rolling out improved connection reuse between CloudFront and API Gateway, where customers will be able to benefit from connections established via other APIs. This should mean that the percentage of requests that need to do a full TLS handshake between CloudFront and API Gateway should be reduced.
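The connection-reuse effect is easy to demonstrate locally (this sketch uses a throwaway HTTP/1.1 server on loopback, not CloudFront): one persistent connection pays the TCP setup cost once, and over HTTPS would pay the TLS handshake once, then amortizes it over every subsequent request.

```python
import http.client
import http.server
import threading

connections = set()  # distinct client (host, port) pairs the server saw

class Handler(http.server.BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # enables keep-alive
    def do_GET(self):
        connections.add(self.client_address)
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):   # silence per-request logging
        pass

server = http.server.HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# A single HTTPConnection reuses the underlying socket for all requests.
conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
statuses = []
for _ in range(3):
    conn.request("GET", "/")
    resp = conn.getresponse()
    statuses.append(resp.status)
    resp.read()  # drain the body so the connection can be reused
conn.close()
server.shutdown()

print(statuses, "over", len(connections), "TCP connection(s)")
```

Three requests, one connection: this is the same mechanism the hop between CloudFront and API Gateway relies on, which is why single isolated test calls always look worse than sustained traffic.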
Please let me know if any of this information could be presented in a clearer format!
I'm trying to figure out where the latency in my calls is coming from.
Some background: I have two systems--System A and System B. I manually (through Postman) hit an endpoint on System A that invokes an endpoint on System B.
System A is hosted on an EC2 instance.
When System B is hosted on a Lambda function behind API Gateway, the latency for the call is 125 ms.
When System B is hosted on an EC2 instance, the latency for the call is 8 ms.
When System B is hosted on an EC2 instance behind API Gateway, the latency for the call is 100 ms.
So, my hypothesis is that API Gateway is the reason for increased latency when it's paired with the Lambda function as well. Can anyone confirm if this is the case, and if so, what is API Gateway doing that increases the latency so much? Is there any way around it? Thank you!
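Taking the three measurements above at face value, the overhead attributable to each hop can be separated with simple subtraction:

```python
lambda_behind_apigw_ms = 125
ec2_direct_ms = 8
ec2_behind_apigw_ms = 100

# API Gateway's share, using the direct EC2 call as the baseline:
apigw_overhead_ms = ec2_behind_apigw_ms - ec2_direct_ms
# The remaining gap on the Lambda path, attributable to Lambda itself:
lambda_overhead_ms = lambda_behind_apigw_ms - ec2_behind_apigw_ms

print(f"API Gateway adds ~{apigw_overhead_ms} ms")
print(f"Lambda adds a further ~{lambda_overhead_ms} ms")
```

So roughly 92 ms of the 117 ms gap comes from API Gateway itself and only about 25 ms from Lambda, which supports the hypothesis in the question.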
It might not be exactly what the original question asks for, but I'll add a comment about CloudFront.
In my experience, both CloudFront and API Gateway will add at least 100 ms each for every HTTPS request on average - maybe even more.
This is due to the fact that, in order to secure your API call, API Gateway enforces SSL in all of its components. This means that if you are using SSL on your backend, your first API call will have to negotiate 3 SSL handshakes:
Client to CloudFront
CloudFront to API Gateway
API Gateway to your backend
It is not uncommon for these handshakes to take over 100 milliseconds, meaning that a single request to an inactive API could see over 300 milliseconds of additional overhead. Both CloudFront and API Gateway attempt to reuse connections, so over a large number of requests you’d expect to see that the overhead for each call would approach only the cost of the initial SSL handshake. Unfortunately, if you’re testing from a web browser and making a single call against an API not yet in production, you will likely not see this.
In the same discussion, it was eventually clarified what the "large number of requests" should be to actually see that connection reuse:
Additionally, when I meant large, I should have been slightly more precise in scale. 1000 requests from a single source may not see significant reuse, but APIs that are seeing that many per second from multiple sources would definitely expect to see the results I mentioned.
...
Unfortunately, while I cannot give you an exact number, you will not see any significant connection reuse until you approach closer to 100 requests per second.
Bear in mind that this is a thread from mid-to-late 2016, and there should be some improvements already in place. But in my own experience, this overhead is still present: a load test on a simple API at 2000 rps was still giving me >200 ms of extra latency as of 2018.
source: https://forums.aws.amazon.com/thread.jspa?messageID=737224
Heard from Amazon support on this:
With API Gateway, the request goes from the client to API Gateway, which means leaving the VPC and going out to the internet, then back into your VPC to reach your other EC2 instance, then back to API Gateway (leaving your VPC again), and then back to your first EC2 instance.
So this additional latency is expected. The only way to lower the latency is to add API caching, which is only going to be useful if the content you are requesting is static and not constantly updated. You will still see the longer latency when an item is evicted from the cache and needs to be fetched from the system, but it will lower most calls.
So I guess the latency is normal, which is unfortunate, but hopefully not something we'll have to deal with constantly moving forward.
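If you do try the caching route the support reply mentions, a sketch of the patch operations API Gateway's UpdateStage API takes to turn on the stage cache (the REST API ID and stage name in the comment are placeholders):

```python
def enable_cache_patch_ops(cluster_size_gb="0.5"):
    """Patch operations for API Gateway's UpdateStage API that enable
    the stage cache. cacheClusterSize is a string enum ("0.5", "1.6",
    "6.1", ... gigabytes)."""
    return [
        {"op": "replace", "path": "/cacheClusterEnabled", "value": "true"},
        {"op": "replace", "path": "/cacheClusterSize", "value": cluster_size_gb},
    ]

ops = enable_cache_patch_ops()
# With boto3 (not executed here), against a hypothetical API and stage:
#   boto3.client("apigateway").update_stage(
#       restApiId="abc123", stageName="prod", patchOperations=ops)
```

Caching can also be toggled per method, so you can cache only the static endpoints and leave the frequently-updated ones uncached.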
In the direct case (#2) are you using SSL? 8 ms is very fast for SSL, although if it's within an AZ I suppose it's possible. If you aren't using SSL there, then using APIGW will introduce a secure TLS connection between the client and CloudFront which of course has a latency penalty. But usually that's worth it for a secure connection since the latency is only on the initial establishment.
Once a connection is established all the way through, or when the API has moderate, sustained volume, I'd expect the average latency with APIGW to drop significantly. You'll still see the ~100 ms latency when establishing a new connection though.
Unfortunately the use case you're describing (EC2 -> APIGW -> EC2) isn't great right now. Since APIGW is behind CloudFront, it is optimized for clients all over the world, but you will see additional latency when the client is on EC2.
Edit:
And the reason why you only see a small penalty when adding Lambda is that APIGW already has lots of established connections to Lambda, since it's a single endpoint with a handful of IPs. The actual overhead (not connection related) in APIGW should be similar to Lambda overhead.