I'm trying to enable API Gateway throttling, but it's not working as expected.
I set Default Method Throttling Rate to 1 request per second, and Burst to 1 request.
Then I created a loop in my code to make 10 simultaneous requests to my API endpoint.
for (let i=0; i<10; i++) {
  axios.get(url);
}
The expected result would be:
1 successful request
9 throttled requests (HTTP 429 error)
But the actual result was the opposite:
9 successful requests
1 throttled request (HTTP 429 error)
I repeated the process with 20 simultaneous requests, and the result was:
16 successful requests
4 throttled requests (HTTP 429 error)
In the CloudWatch logs for this API method, I found several log streams, each only a few milliseconds apart.
If I set Rate to 0 requests per second and Burst to 0 requests, the throttling works and ALL requests get throttled. But when I set Rate and Burst to 1, it does not work as expected.
Why is that happening? I need to limit my API to only 1 request per second.
It seems AWS API Gateway throttling is not very precise for small values of rate/burst.
I imagine that there are multiple "instances" of the API Gateway running, and the values of rate and burst are "eventually consistent".
However I did not find any documentation about that.
When I made an initial request and waited 500 milliseconds before making the other 99 requests, the results were "less imprecise".
Example:
axios.get(url);
setTimeout(function(){
  console.log("After 500 ms");
  for (let i=0; i<99; i++) {
    axios.get(url);
  }
}, 500);
Results:
One time I got 1 success and 99 throttles.
Another time I got 12 successes and 88 throttles.
Another time I got 33 successes and 67 throttles.
However, it's difficult to have consistent results.
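To compare runs like these more easily, the requests can be awaited and tallied by status code instead of being fired and forgotten. This is only a minimal sketch, assuming Node.js: `tallyStatuses` and the `sendRequest` callback are hypothetical names, and `sendRequest` stands for any function that returns a promise resolving to an HTTP status code.

```javascript
// Fire `count` requests at once and count the outcomes per status code.
// `sendRequest` is a hypothetical callback returning a promise of a status code.
async function tallyStatuses(sendRequest, count) {
  const pending = [];
  for (let i = 0; i < count; i++) {
    pending.push(sendRequest(i));
  }
  // allSettled waits for every request, whether it resolved or rejected.
  const results = await Promise.allSettled(pending);
  const tally = {};
  for (const result of results) {
    // Rejected promises (network errors, timeouts) are grouped under "error".
    const key = result.status === 'fulfilled' ? String(result.value) : 'error';
    tally[key] = (tally[key] || 0) + 1;
  }
  return tally;
}
```

With axios, `sendRequest` could be something like `() => axios.get(url, { validateStatus: () => true }).then((res) => res.status)`, so that a 429 resolves with its status code instead of rejecting.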
There are two ways to apply limits on API calls:
Account-level throttling
API-level and stage-level throttling
When you need to apply API-level or stage-level throttling, you have to use usage plans:
A usage plan specifies who can access one or more deployed API stages and methods—and also how much and how fast they can access them
I'm using Artillery to run a small load test against a REST API (Edge endpoint) deployed to AWS API Gateway with the Serverless framework.
This API has a custom domain and ACM certificate configured, and since I'm using the Edge endpoint type it also has a CloudFront distribution in front of it.
This is the flow for the request:
Cloudfront -> API Gateway -> Lambda Authorizer -> Lambda -> Other services
When I run around 100 requests per second for 60 seconds (6,000 requests in total), the results are fine (only HTTP 202). But when I go to 200 requests per second (12,000 requests in total), I start getting errors that Artillery reports as "ETIMEDOUT". Looking at the CloudWatch logs, I couldn't find any related error; only the successful requests are visible there.
I went through the metrics of both Lambdas in my flow, and they also show only successful invocations and no execution errors, e.g. no Lambda timeouts.
For example, the Artillery report shows 9666 successful responses, and this is the same value I found for the Lambda invocations.
Artillery report (example):
errors.ETIMEDOUT: .............................................................. 2334
http.codes.202: ................................................................ 9666
http.request_rate: ............................................................. 179/sec
http.requests: ................................................................. 12000
http.response_time:
  min: ......................................................................... 143
  max: ......................................................................... 601
  median: ...................................................................... 179.5
  p95: ......................................................................... 407.5
  p99: ......................................................................... 432.7
http.responses: ................................................................ 9666
vusers.completed: .............................................................. 9666
vusers.created: ................................................................ 12000
vusers.created_by_name.0: ...................................................... 12000
vusers.failed: ................................................................. 2334
vusers.session_length:
  min: ......................................................................... 190
  max: ......................................................................... 7530.3
  median: ...................................................................... 237.5
  p95: ......................................................................... 459.5
  p99: ......................................................................... 507.8
Note: There is no pattern in these error results. Each execution generates a different number of "ETIMEDOUT" errors.
Artillery yml test definition
config:
  target: 'https://testing.mydomain.com'
  phases:
    - duration: 60
      arrivalRate: 200
  defaults:
    headers:
      Authorization: 'Bearer XXXXXX'
scenarios:
  - flow:
      - post:
          url: "/create"
          json:
            clt: "{{ $randomString() }}"
            value: "10"
            prd: "abcdefg"
          log: "Sending info to {{ $randomString() }}"
Judging by the CloudWatch metrics for API Gateway, it seems only the successful requests (9666 in the example above) are reaching the API. I'm checking the "count" metric.
I'm wondering if there is any API limit that I couldn't find.
I believe you may be hitting this limit:
https://docs.aws.amazon.com/apigateway/latest/developerguide/limits.html
"10,000 requests per second (RPS) with an additional burst capacity provided by the token bucket algorithm, using a maximum bucket capacity of 5,000 requests. *
Note
The burst quota is determined by the API Gateway service team based on the overall RPS quota for the account in the Region. It is not a quota that a customer can control or request changes to."
I could be wrong, but it's worth checking these limits.
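For anyone unfamiliar with the token bucket algorithm mentioned in that quote, here is a minimal generic sketch in JavaScript. It is not API Gateway's actual implementation, only an illustration of the idea; the class name `TokenBucket` and its parameters are made up for this example.

```javascript
// Generic token bucket: tokens refill at `ratePerSecond` up to `capacity`;
// each request consumes one token or is rejected (throttled).
class TokenBucket {
  constructor(ratePerSecond, capacity, now = () => Date.now()) {
    this.ratePerSecond = ratePerSecond;
    this.capacity = capacity;
    this.now = now;          // injectable clock returning milliseconds
    this.tokens = capacity;  // start with a full bucket
    this.lastRefill = now();
  }

  tryAcquire() {
    // Add the tokens earned since the last call, capped at the bucket capacity.
    const current = this.now();
    const elapsedSeconds = (current - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.ratePerSecond);
    this.lastRefill = current;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;  // request allowed
    }
    return false;   // request throttled
  }
}
```

Under this model, a full bucket lets a burst of up to `capacity` requests through at once, after which requests are only admitted as fast as tokens refill at the steady rate.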
When setting throttling limits for our API, it appears that the Rate Limit works successfully but the Quota does not.
We created a subscription that limits to 10 requests/second, and when running tests, we obtain a 429 response upon sending an 11th query in one second, which is exactly what we want and expect.
However, the filter also has a Quota of 100 requests/minute, yet we are able to run over 100 requests (have tested up to 300 queries and still gotten entirely 200 response codes) in the span of a minute without getting throttled.
Ever since we started batching our requests our total requests in the App Dashboard API stats went down 50% a day, and our error rate grew by 200% a day.
If you get throttled doing batch requests... n number of requests will return an error. For example:
I make 50 requests in a batch and the first 20 requests are good. At request 21 our account gets throttled, so requests 21-50 all receive a throttling error.
Does this count as 30 errors or 1 error in the API stats?
From my understanding, API Gateway by default has a 1000 RPS limit--when this is crossed, it will begin throttling calls and returning 429 error codes. Past the Gateway, Lambda has a 100 concurrent invocation limit, and when this is crossed, it will begin throttling calls and returning 500 (or 502) error codes.
Given that, when viewing my graph on Cloudwatch, I would expect my number of throttled calls to be closer to the number of 4XX errors, or at least above the number of 5XX errors, because the calls must pass through API Gateway first in order to get to Lambda at all. However, it looks like the number of throttled calls is closer to the number of 5XX errors.
Is there something I might be missing from the way I'm reading the graph?
Depending on how long your Lambda function takes to execute and how spread out your requests are, you can hit the Lambda limits well before or well after the API Gateway throttling limits. I'd say the two metrics you are comparing are independent of each other.
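A rough way to see which limit is reached first is Little's law: the average number of in-flight requests is approximately the arrival rate multiplied by the average duration. The snippet below is only a back-of-the-envelope sketch; the function name `estimatedConcurrency` is made up, and the 1000 ms duration is an illustrative assumption, not a measurement.

```javascript
// Little's law: average in-flight requests = arrival rate * average duration.
function estimatedConcurrency(requestsPerSecond, avgDurationMs) {
  return (requestsPerSecond * avgDurationMs) / 1000;
}

// Assumed 1000 ms function duration: 100 requests/s already needs
// about 100 concurrent executions.
console.log(estimatedConcurrency(100, 1000));

// Assumed 100 ms function duration: the same 100 requests/s needs about 10.
console.log(estimatedConcurrency(100, 100));
```

With the limits mentioned in the question (1000 RPS at the gateway, 100 concurrent Lambda invocations), a function that runs for about one second would reach the Lambda limit at roughly 100 requests per second, long before the gateway starts throttling.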
According to the API Gateway Request documentation:
API Gateway limits the steady-state request rate to 10,000 requests per second (rps)
This means that per 100 milliseconds the API can process 1,000 requests.
The comments above are correct in stating that CloudWatch is not giving you the full picture. The actual performance of your system depends on both the runtime of your lambda and the number of concurrent requests.
To better understand what is going on, I suggest using the Lambda Load Tester seen in the following images, or building your own.
Testing
The lambda used has the following properties:
Upon Invocation, it sleeps for 1 second and then exits.
Has a Reserved Concurrency limit of 25, meaning the lambda will only execute 25 concurrent instances. Any surplus will be returned with a 500 error.
Requests: 1000 Concurrent: 25
In the first test, we'll send 1000 requests in 40 batches of 25 requests each.
Command:
bash run.sh -n 1000 -c 25
Output:
Status code distribution:
[200] 1000 responses
Summary:
In this case, the number of requests was below both the Lambda and API Gateway limits. All executions were successful.
Requests: 1000 Concurrent: 50
In this test, we'll send 1000 requests in 20 batches of 50 requests each.
Command:
bash run.sh -n 1000 -c 50
Output:
Status code distribution:
[200] 252 responses
[500] 748 responses
Summary:
In this case, the number of requests was below the API Gateway limit, so every request was passed to the Lambda. However, 50 concurrent requests exceeded the limit of 25 we placed on the Lambda, so about 75% of the requests returned a 500 error.
Requests: 800 Concurrent: 800
In this test, we'll send 800 requests in a single batch.
Command:
bash run.sh -n 800 -c 800
Output:
Status code distribution:
[200] 34 responses
[500] 765 responses
Error distribution:
[1] Get https://XXXXXXX.execute-api.us-east-1.amazonaws.com/dev/dummy: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
Summary:
In this case, the number of requests was starting to push the limits of the API Gateway and you can see one of the requests timed out. The 800 concurrent requests well exceeded the 25 reserved concurrency limit we placed on the lambda and in this case, about 95% of the requests returned a 500 error.
Requests: 3000 Concurrent: 1500
In this test, we'll send 3000 requests in 2 batches of 1500 requests each.
Command:
bash run.sh -n 3000 -c 1500
Output:
Status code distribution:
[200] 69 responses
[500] 1938 responses
Error distribution:
[985] Get https://drlhus6zf3.execute-api.us-east-1.amazonaws.com/dev/dummy: dial tcp 52.84.175.209:443: connect: connection refused
[8] Get https://drlhus6zf3.execute-api.us-east-1.amazonaws.com/dev/dummy: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
Summary:
In this case, the number of requests exceeded the limits of the API Gateway and several of the connection attempts were refused. Those that did pass through the Gateway were still met with the reserved concurrency limit we placed on the lambda and returned a 500 error.
Update: Keep-alive wasn't set on the AWS client. My fix was
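var https = require('https');
var aws = require('aws-sdk');
// Reuse TLS connections across requests instead of opening a new one each time.
aws.config.httpOptions.agent = new https.Agent({
  keepAlive: true
});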
I finally managed to debug it by using the Node --prof flag and then analyzing the output with node-tick-processor (a packaged version of a tool distributed in the Node/V8 source code). Most of the processing time was spent in SSL processing, and that's when I thought to check whether or not the client was using keep-alive.
TL;DR Getting throttled by AWS when the number of requests is less than the configured DynamoDB throughput. Is there a request rate limit for all APIs?
I'm having a hard time finding documentation about the rate limiting of AWS APIs.
An application that I'm testing now is making about 80 requests per second to DynamoDB. This is a mix of PUTs and GETs. My DynamoDB table is configured with a throughput of: 250 reads / 250 writes. In the table CloudWatch metrics, the reads peak at 24 and the writes at 59 during the test period.
This is a sample of my response times. First, subsecond response times.
2015-10-07T15:28:55.422Z 200 in 20 milliseconds in request to dynamodb.us-east-1.amazonaws.com
2015-10-07T15:28:55.423Z 200 in 22 milliseconds in request to dynamodb.us-east-1.amazonaws.com
A lot longer, but fine...
2015-10-07T15:29:33.907Z 200 in 244 milliseconds in request to dynamodb.us-east-1.amazonaws.com
2015-10-07T15:29:33.910Z 200 in 186 milliseconds in request to dynamodb.us-east-1.amazonaws.com
The requests are piling up...
2015-10-07T15:32:41.103Z 200 in 1349 milliseconds in request to dynamodb.us-east-1.amazonaws.com
2015-10-07T15:32:41.104Z 200 in 1181 milliseconds in request to dynamodb.us-east-1.amazonaws.com
...no...
2015-10-07T15:41:09.425Z 200 in 6596 milliseconds in request to dynamodb.us-east-1.amazonaws.com
2015-10-07T15:41:09.428Z 200 in 5902 milliseconds in request to dynamodb.us-east-1.amazonaws.com
I went and got some tea...
2015-10-07T15:44:26.463Z 200 in 13900 milliseconds in request to dynamodb.us-east-1.amazonaws.com
2015-10-07T15:44:26.464Z 200 in 12912 milliseconds in request to dynamodb.us-east-1.amazonaws.com
Anyway, I stopped the test, but this is a Node.js application so a bunch of sockets were left open waiting for my requests to AWS to complete. I got response times > 60 seconds.
My DynamoDB throughput wasn't used much, so I assume that the limit is in API requests but I can't find any information on it. What's interesting is that the 200 part of the log entries is the response code from AWS which I got by hacking a bit of the SDK. I think AWS is supposed to return 429s -- all their SDKs implement exponential backoff.
Anyway -- I assumed that I could make as many requests to DynamoDB as configured throughput. Is that right? ...or what?
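For reference, this is roughly what I mean by exponential backoff. It is a generic sketch with "full jitter", not the AWS SDK's actual retry code; the function names and default values are made up for illustration.

```javascript
// Delay before retry number `attempt` (0-based): a random value in
// [0, min(maxDelayMs, baseDelayMs * 2^attempt)), i.e. "full jitter".
function backoffDelayMs(attempt, baseDelayMs = 50, maxDelayMs = 5000, random = Math.random) {
  const ceiling = Math.min(maxDelayMs, baseDelayMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Retry `operation` (a function returning a promise) up to `maxRetries` times,
// sleeping for an increasing, randomized delay between attempts.
async function retryWithBackoff(operation, maxRetries = 3, delayFn = backoffDelayMs) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      if (attempt >= maxRetries) throw err; // out of retries: surface the error
      await new Promise((resolve) => setTimeout(resolve, delayFn(attempt)));
    }
  }
}
```

A retry loop like this only helps if the service actually signals throttling with an error; if every response comes back as a slow 200, as in my logs, nothing would trigger it.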