We need an endpoint that receives HTTP POST requests and forwards them to SQS with both headers and payload. API Gateway (REGIONAL type) with an SQS integration works great and satisfies our needs. However, we are slightly worried about the limit of 600 requests per second, as it might not be enough for our case. Do we understand correctly that an API Gateway HTTP API (as opposed to a REST API with REGIONAL or EDGE types) can receive 10,000 requests per second, but that in this case we would need to build our own integration to SQS (e.g. using Lambdas)?
A bit late, but I believe the quota of 600 refers to regional APIs per region (not a request rate). That means you can create 600 APIs, each of which can receive up to 10k requests per second. The 10k requests per second quota is, however, shared across these APIs: if you have 100 APIs, each of them can, on average, receive 1k requests per second. But if 99 of them sit idle, the remaining API can get the full 10k requests per second.
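If you do go the HTTP API route, the "build your own integration" piece can be a very small Lambda. A minimal sketch, assuming the queue URL and header-forwarding scheme below (both are illustrative choices, not prescribed by anything above; note that SQS allows at most 10 message attributes per message):

```python
import json

QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"  # hypothetical

def build_sqs_message(event, queue_url):
    # Forward both headers and payload: the HTTP body becomes the message
    # body, and the HTTP headers become SQS message attributes.
    headers = event.get("headers") or {}
    attributes = {
        name: {"DataType": "String", "StringValue": value}
        for name, value in headers.items()
        if value  # SQS string attributes cannot be empty
    }
    # Caution: SQS permits at most 10 message attributes per message,
    # so a real handler would filter to the headers it actually needs.
    return {
        "QueueUrl": queue_url,
        "MessageBody": event.get("body") or "",
        "MessageAttributes": attributes,
    }

def handler(event, context):
    import boto3  # imported lazily so build_sqs_message stays testable offline
    sqs = boto3.client("sqs")
    sqs.send_message(**build_sqs_message(event, QUEUE_URL))
    return {"statusCode": 202, "body": json.dumps({"queued": True})}
```

The header-to-attribute mapping is one possible convention; the consumer on the SQS side has to agree on it.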
I have an API Gateway attached to a Lambda. The Lambda returns its response within 300 ms on average, but at the API Gateway level the integration latency sometimes takes much longer. I can see this by capturing the integration latency in the CloudWatch Logs for API Gateway. At times it exceeds 29 seconds and throws a gateway timeout exception, which drives up the total latency.
Is there a way to reduce the integration latency at the API Gateway level, or should we live with occasional 504 timeout errors?
There are a few things you could consider doing:
Move from a REST API Gateway to an HTTP API Gateway if you haven't already. It has fewer features, but if that's acceptable you will gain substantial speed, along with lower costs and simpler maintenance.
Use provisioned concurrency to avoid cold starts (comes with increased costs).
I would advise using an HTTP API if you are only using Lambdas. It's more recent, and frankly I think it's a lot easier to understand and use than a REST API.
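For the provisioned concurrency suggestion, the boto3 call looks roughly like the sketch below. The function name and alias are made up for illustration; the helper just assembles the request parameters:

```python
def provisioned_concurrency_config(function_name, alias, executions):
    # Builds the kwargs for boto3's put_provisioned_concurrency_config call.
    # Provisioned concurrency keeps this many execution environments warm,
    # avoiding cold starts (at an hourly cost per provisioned environment).
    return {
        "FunctionName": function_name,
        "Qualifier": alias,  # provisioned concurrency targets an alias or version
        "ProvisionedConcurrentExecutions": executions,
    }

# To apply it (requires AWS credentials and a deployed function):
# import boto3
# boto3.client("lambda").put_provisioned_concurrency_config(
#     **provisioned_concurrency_config("my-api-handler", "live", 10))
```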
In most of the official documents expressing throttling limits, AWS uses metrics like requests per second or requests per client, e.g. here. But for the AWS IoT API throttling limits, they use a metric called transactions per second. Is there an actual difference between "transactions per second" and "requests per second", or are they just the same?
They mean the same thing: the rate at which you're allowed to call the API. There seems to be no standard for this term; it's chosen at the discretion of the writers. Some services state only a plain number, e.g. 1000, others use requests, and a few use transactions.
We would like to create a serverless architecture for our startup, and we would like to support up to 1 million requests per second and 50 million active users. How can we handle this use case with an AWS architecture?
According to the AWS documentation, API Gateway can handle only 10K requests/s and Lambda can process 1K invocations/s, and for us this is unacceptable.
How can we overcome this limitation? Can we request higher throughput from AWS support, or can we connect somehow to other AWS services (queues)?
Thanks!
Those numbers you quoted are the default account limits. Lambda and API Gateway can handle more than that, but you have to send a request to Amazon to raise your account limits. If you are truly going to receive 1 million API requests per second then you should discuss it with an AWS account rep. Are you sure most of those requests won't be handled by a cache like CloudFront?
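As a sanity check before that conversation, Little's law gives the Lambda concurrency such a load would require: concurrent executions ≈ arrival rate × average duration. A back-of-the-envelope sketch (the 1M rps figure is from the question; the 300 ms duration is purely an illustrative assumption):

```python
def required_concurrency(requests_per_second, avg_duration_seconds):
    # Little's law: concurrency = arrival rate x time each request spends in flight.
    return requests_per_second * avg_duration_seconds

# 1,000,000 rps at an assumed 300 ms per invocation:
needed = required_concurrency(1_000_000, 0.3)
print(needed)  # 300000.0 concurrent executions
```

A number like that is far beyond default account limits, which is why the answer above points you toward an account rep and toward shedding as much traffic as possible into a cache.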
The gateway is NOT your API server; the Lambdas are the bottleneck.
While the gateway can handle 100,000 messages/sec (because it goes through a message queue), Lambdas top out at around 2,200 rps even with scaling (https://amido.com/blog/azure-functions-vs-aws-lambda-vs-google-cloud-functions-javascript-scaling-face-off/).
This differs dramatically from actual API framework implementations, where the scale goes up to 3,500+ rps...
I think you should go with an Application Load Balancer.
It has no fixed RPS quota and can potentially be even cheaper for a large number of requests. It does have fewer integrations with AWS services, but in general it has everything you need for a gateway.
https://dashbird.io/blog/aws-api-gateway-vs-application-load-balancer/
I'm trying to figure out where the latency in my calls is coming from, please let me know if any of this information could be presented in a format that is more clear!
Some background: I have two systems, System A and System B. I manually (through Postman) hit an endpoint on System A that invokes an endpoint on System B.
System A is hosted on an EC2 instance.
When System B is hosted on a Lambda function behind API Gateway, the latency for the call is 125 ms.
When System B is hosted on an EC2 instance, the latency for the call is 8 ms.
When System B is hosted on an EC2 instance behind API Gateway, the latency for the call is 100 ms.
So, my hypothesis is that API Gateway is the reason for increased latency when it's paired with the Lambda function as well. Can anyone confirm if this is the case, and if so, what is API Gateway doing that increases the latency so much? Is there any way around it? Thank you!
It might not be exactly what the original question asks for, but I'll add a comment about CloudFront.
In my experience, both CloudFront and API Gateway will add at least 100 ms each for every HTTPS request on average - maybe even more.
This is due to the fact that, in order to secure your API call, API Gateway enforces SSL in all of its components. This means that if you are using SSL on your backend, your first API call will have to negotiate three SSL handshakes:
Client to CloudFront
CloudFront to API Gateway
API Gateway to your backend
It is not uncommon for these handshakes to take over 100 milliseconds, meaning that a single request to an inactive API could see over 300 milliseconds of additional overhead. Both CloudFront and API Gateway attempt to reuse connections, so over a large number of requests you’d expect to see that the overhead for each call would approach only the cost of the initial SSL handshake. Unfortunately, if you’re testing from a web browser and making a single call against an API not yet in production, you will likely not see this.
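The arithmetic behind that claim can be made concrete with a toy model. The 100 ms per handshake comes from the discussion above; it is an illustration, not a measurement:

```python
HANDSHAKE_MS = 100  # "not uncommon for these handshakes to take over 100 milliseconds"

# The three hops described above, each needing its own SSL handshake on a cold path:
hops = ["client -> CloudFront", "CloudFront -> API Gateway", "API Gateway -> backend"]

# Worst case on a brand-new, inactive API: every hop handshakes from scratch.
cold_overhead_ms = HANDSHAKE_MS * len(hops)   # ~300 ms of additional overhead

# With connection reuse on the AWS side, only the client-facing handshake remains.
warm_overhead_ms = HANDSHAKE_MS               # ~100 ms
```

This is why a single browser request against an idle API looks so much worse than a sustained load test: the sustained case amortizes two of the three handshakes away.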
In the same discussion, it was eventually clarified what the "large number of requests" should be to actually see that connection reuse:
Additionally, when I said "large", I should have been slightly more precise about scale. 1,000 requests from a single source may not see significant reuse, but APIs seeing that many requests per second from multiple sources would definitely expect to see the results I mentioned.
...
Unfortunately, while I cannot give you an exact number, you will not see any significant connection reuse until you approach closer to 100 requests per second.
Bear in mind that this is a thread from mid-to-late 2016, and there should be some improvements in place by now. But in my own experience this overhead is still present: a load test on a simple API at 2,000 rps was still showing me >200 ms of extra latency as of 2018.
source: https://forums.aws.amazon.com/thread.jspa?messageID=737224
Heard from Amazon support on this:
With API Gateway, the request goes from the client to API Gateway, which means leaving the VPC and going out to the internet, then back into your VPC to reach your other EC2 instance, then back to API Gateway (leaving your VPC again), and then back to your first EC2 instance.
So this additional latency is expected. The only way to lower it is to add API caching, which is only useful if the content you are requesting is static rather than constantly updating. You will still see the longer latency when an item is evicted from the cache and needs to be fetched from the system, but it will lower the latency of most calls.
So I guess the latency is normal, which is unfortunate, but hopefully not something we'll have to deal with constantly moving forward.
In the direct case (#2) are you using SSL? 8 ms is very fast for SSL, although if it's within an AZ I suppose it's possible. If you aren't using SSL there, then using APIGW will introduce a secure TLS connection between the client and CloudFront which of course has a latency penalty. But usually that's worth it for a secure connection since the latency is only on the initial establishment.
Once a connection is established all the way through, or when the API has moderate, sustained volume, I'd expect the average latency with APIGW to drop significantly. You'll still see the ~100 ms latency when establishing a new connection though.
Unfortunately the use case you're describing (EC2 -> APIGW -> EC2) isn't great right now. Since APIGW is behind CloudFront, it is optimized for clients all over the world, but you will see additional latency when the client is on EC2.
Edit:
And the reason why you only see a small penalty when adding Lambda is that APIGW already has lots of established connections to Lambda, since it's a single endpoint with a handful of IPs. The actual overhead (not connection related) in APIGW should be similar to Lambda overhead.
Do I get charged for publishing a message to an endpoint even if that endpoint is marked as not enabled in AWS SNS? Do I need to run a clean-up job on my side to make sure that all the endpoints linked to a user are always up to date?
Thanks.
So you're charged for each authenticated API request, on top of the additional per-notification pricing etc.
It's currently charged at $0.50 per 1 million API requests, regardless of whether the request returns an HTTP 200 or not.
First 1 million Amazon SNS requests per month are free
$0.50 per 1 million Amazon SNS requests thereafter
Source
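That pricing can be sketched as a small helper, using the rates quoted above (check the current SNS pricing page before relying on them):

```python
def sns_api_request_cost_usd(monthly_requests):
    # First 1 million SNS API requests per month are free,
    # then $0.50 per 1 million requests thereafter (rates as quoted above).
    FREE_TIER = 1_000_000
    PRICE_PER_MILLION_USD = 0.50
    billable = max(0, monthly_requests - FREE_TIER)
    return billable / 1_000_000 * PRICE_PER_MILLION_USD

print(sns_api_request_cost_usd(5_000_000))  # 2.0
```

Since every authenticated publish counts, including ones against disabled endpoints, pruning stale endpoints does directly reduce this line item.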