What does the "validDuration", "MaxAmount" parameter mean when using the FIXED_WINDOW algorithm?
As far as I understand from the Istio GitHub repository code for the Redis quota adapter, validDuration and maxAmount mean the same thing for ROLLING_WINDOW and FIXED_WINDOW. The only difference I found is the bucket duration:
The bucketDuration will be ignored if rateLimitAlgorithm is FIXED_WINDOW
And from the documentation:
FIXED_WINDOW -> The fixed window approach can allow 2x peak specified rate, whereas the rolling-window doesn’t.
ROLLING_WINDOW -> The rolling window algorithm’s additional precision comes at the cost of increased redis resource usage.
Take a look at the redisquota adapter code for max_amount and validDuration.
So I think the answer to your question, quoted from the older docs and the GitHub repository code above, is:
maxAmount -> int64 -> The upper limit for this quota.
validDuration -> Duration -> The amount of time allocated quota remains valid before it is automatically released. This is only meaningful for rate limit quotas, otherwise the value must be zero.
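To make that concrete, here is a minimal Python sketch (my own illustration, not the adapter's actual code) of how the two parameters behave under FIXED_WINDOW: validDuration acts as the length of each window and maxAmount is the cap that resets when a new window starts.

```python
import time

class FixedWindowLimiter:
    """Illustrative only -- not the Istio redisquota implementation."""

    def __init__(self, max_amount: int, valid_duration_s: float):
        self.max_amount = max_amount            # maxAmount: upper limit for the quota
        self.valid_duration = valid_duration_s  # validDuration: length of each window
        self.counters = {}                      # window index -> amount granted so far

    def allow(self, amount: int = 1) -> bool:
        window = int(time.time() // self.valid_duration)  # fixed windows, not sliding
        used = self.counters.get(window, 0)
        if used + amount > self.max_amount:
            return False                        # quota exhausted until the next window
        self.counters[window] = used + amount
        return True

# Example: at most 100 requests per 60-second window.
limiter = FixedWindowLimiter(max_amount=100, valid_duration_s=60)
print(limiter.allow())
```

The "2x peak" caveat in the docs follows from this shape: up to maxAmount requests can land at the very end of one window and another maxAmount at the very start of the next, so roughly twice the specified rate can pass in a span shorter than validDuration; the rolling window avoids that at the cost of extra Redis bookkeeping.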
Hope you find this useful.
I was trying to understand Horizontal Pod Autoscaling from the documentation. The initial examples explaining the use of different attributes to configure the scaling behavior were clear.
However, one section called default behavior shows a very counter-intuitive example of a scale-down configuration.
Here's just the scale down part of the scaling behavior:
behavior:
  scaleDown:
    stabilizationWindowSeconds: 300
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15
and here's their explanation:
For scaling down the stabilization window is 300 seconds (or the value of the --horizontal-pod-autoscaler-downscale-stabilization flag if provided). There is only a single policy for scaling down which allows a 100% of the currently running replicas to be removed which means the scaling target can be scaled down to the minimum allowed replicas.
If I understand correctly, stabilization essentially looks at the previously computed desired states from the window immediately preceding the current moment, which is configured via the stabilization flag. Meanwhile, the policies attribute right below stabilization in the code block quoted above describes radically different behavior.
So I have some questions:
What exactly is the rolling maximum referred to in the stabilization window section of the documentation? This question stems from the following:
a) From the given explanation in the docs, why does the desired state have to be inferred? Isn't the desired state supposed to be a fixed threshold that the current state either falls below or exceeds?
b) Why would there be a need to average over anything other than the previous state (before the current state) or the previous incoming traffic?
How does stabilization implement a rolling maximum over a period of 300 seconds if, at the same time, there is another policy that allows scaling down all the way to the minimum allowed replicas within a drastically shorter duration of 15 seconds?
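To make my (possibly wrong) mental model concrete, here is a rough Python sketch of how I currently imagine the stabilization window and the scale-down policy combining; the names are mine, not from the controller code:

```python
# My rough mental model, not the actual HPA controller code.

def stabilized_desired(recent_recommendations: list[int]) -> int:
    # Rolling maximum: over the desired replica counts computed during the
    # last 300 seconds, pick the highest, so a short dip in load does not
    # trigger an immediate scale-down.
    return max(recent_recommendations)

def apply_scale_down_policy(current: int, desired: int, percent: int = 100) -> int:
    # Policy: within one 15-second period, at most `percent` of the currently
    # running replicas may be removed.
    max_removable = current * percent // 100
    return max(desired, current - max_removable)

# Example: recommendations over the last 300 seconds dipped to 2 but peaked at 10.
recommendations = [10, 8, 5, 2]
target = apply_scale_down_policy(current=10, desired=stabilized_desired(recommendations))
print(target)  # 10 -- the rolling maximum "wins" over the brief dip
```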
I know my questions might reflect an incomplete understanding, but please help me get the necessary intuition to work with HPA. Thanks!
The problem: very frequent "403 Request throttled due to too many requests" errors during data indexing, which appears to be a memory usage issue.
The infrastructure:
Elasticsearch version: 7.8
t3.small.elasticsearch instance (2 vCPU, 2 GB memory)
Default settings
Single domain, 1 node, 1 shard per index, no replicas
There are 3 indices with searchable data. Two of them have roughly 1 million documents (500-600 MB) each and one has 25k (~20 MB). Indexing is not very simple (it has history tracking), so I've been testing refresh with true and wait_for values, or calling it separately when needed. The process uses search and bulk queries (I've been trying sizes of 500 and 1000). There should be a 10 MB limit on the AWS side, so these are safely below that. I've also tested adding 0.5/1 second delays between requests, but none of this fiddling has had any noticeable benefit.
The project is currently in development, so there is basically no traffic besides the indexing process itself. The smallest index generally needs an update once every 24 hours, the larger ones once a week. Upscaling the infrastructure is not something we want to do just because indexing is so brittle. Even updating only the 25k index twice in a row tends to fail with the above-mentioned error. Any ideas how to reasonably solve this issue?
Update 2020-11-10
Did some digging in past logs and found that we used to get 429 circuit_breaking_exceptions (instead of the current 403) with a reason along the lines of [parent] Data too large, data for [<http_request>] would be [1017018726/969.9mb], which is larger than the limit of [1011774259/964.9mb], real usage: [1016820856/969.7mb], new bytes reserved: [197870/193.2kb], usages [request=0/0b, fielddata=0/0b, in_flight_requests=197870/193.2kb, accounting=4309694/4.1mb]. I used the cluster stats API to track memory usage during indexing, but didn't find anything that I could identify as a direct cause of the issue.
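For reference, tracking heap usage during indexing can be as simple as polling the cluster stats endpoint; a rough sketch of how such a check could look (not my exact script; the URL is a placeholder and the field path comes from the cluster stats response format):

```python
import time
import requests

ES_URL = "https://my-es-domain.example.com"  # placeholder endpoint

# Poll the cluster stats API and print JVM heap usage while indexing runs.
for _ in range(10):
    stats = requests.get(f"{ES_URL}/_cluster/stats").json()
    mem = stats["nodes"]["jvm"]["mem"]
    used_mb = mem["heap_used_in_bytes"] / 1024 / 1024
    max_mb = mem["heap_max_in_bytes"] / 1024 / 1024
    print(f"heap: {used_mb:.0f} MB / {max_mb:.0f} MB")
    time.sleep(5)
```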
Ended up creating a solution based on the information that I could find. After some searching and reading it seemed like just trying again when running into errors is a valid approach with Elasticsearch. For example:
Make sure to watch for TOO_MANY_REQUESTS (429) response codes
(EsRejectedExecutionException with the Java client), which is the way
that Elasticsearch tells you that it cannot keep up with the current
indexing rate. When it happens, you should pause indexing a bit before
trying again, ideally with randomized exponential backoff.
The same guide also has useful information about refreshes:
The operation that consists of making changes visible to search -
called a refresh - is costly, and calling it often while there is
ongoing indexing activity can hurt indexing speed.
By default, Elasticsearch periodically refreshes indices every second,
but only on indices that have received one search request or more in
the last 30 seconds.
In my use case indexing is a single linear process that does not occur frequently so this is what I did:
Disabled automatic refreshes (index.refresh_interval set to -1)
Using refresh API and refresh parameter (with true value) when and where needed
When running into a "403 Request throttled due to too many requests" error, the program keeps retrying every 15 seconds until it succeeds or the time limit (currently 60 seconds) is hit; a rough sketch of that loop is below. I'll adjust the numbers/functionality if needed, but results have been good so far.
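The retry logic, roughly (endpoint and payload handling below are placeholders, not our exact code):

```python
import time
import requests

def bulk_with_retry(es_url: str, ndjson_payload: str,
                    wait_s: int = 15, time_limit_s: int = 60):
    """Retry a _bulk request while the cluster answers 403/429 (throttled)."""
    deadline = time.monotonic() + time_limit_s
    while True:
        resp = requests.post(
            f"{es_url}/_bulk",
            data=ndjson_payload,
            headers={"Content-Type": "application/x-ndjson"},
        )
        if resp.status_code not in (403, 429):
            resp.raise_for_status()   # any other error is a real failure
            return resp.json()
        if time.monotonic() >= deadline:
            resp.raise_for_status()   # give up once the time limit is hit
        time.sleep(wait_s)            # throttled: wait and try again
```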
This way the indexing is still fast, but will slow down when needed to provide better stability.
Introduction
We are trying to "measure" the cost of usage of a specific use case on one of our Aurora DBs that is not used very often (we use it for staging).
Yesterday at 18:18 hrs. UTC we issued some representative queries to it and today we were examining the resulting graphs via Amazon CloudWatch Insights.
Since we are being billed USD 0.22 per million read/write IOs, we need to know how many of those there were during our little experiment yesterday.
A complicating factor is that in the cost explorer it is not possible to group the final billed costs for read/write IOs per DB instance! Therefore, the only thing we can think of to estimate the cost is the read/write volume IO graphs in CloudWatch Insights.
So we went to CloudWatch Insights and selected the graphs for read/write IOs. Then we selected the period of time in which we did our experiment. Finally, we examined the graphs with different options: "Number" and "Lines".
Graph with "number"
This shows us the picture below, suggesting a total billable IO count of 266 + 510 = 776. Since we have chosen the "Sum" statistic, we assume this would indicate a cost of about USD 0.00017 in total.
Graph with "lines"
However, if we choose the "Lines" option, then we see another picture, with 5 points on the line (the first ones around 500 for read IOs and the last one at approx. 750), suggesting a total of around 5000 read/write IOs.
Our question
We are not really sure which interpretation to go with and the difference is significant.
So our question is now: how much did our little experiment cost us and, equivalently, how do we interpret these graphs?
Edit:
Using 5-minute intervals (as suggested in the comments) we get (see below) a horizontal line with points at 255 (read IOs) for a whole hour around the time we did our experiment. But the experiment took less than 1 minute at 19:18 (UTC).
Will the (read) billing be for 12 * 255 IOs or 255 ... (or something else altogether)?
Note: This question triggered another follow-up question created here: AWS CloudWatch insights graph — read volume IOs are up much longer than actual reading
From the Aurora RDS documentation:
VolumeReadIOPs
The number of billed read I/O operations from a cluster volume within
a 5-minute interval.
Billed read operations are calculated at the cluster volume level,
aggregated from all instances in the Aurora DB cluster, and then
reported at 5-minute intervals. The value is calculated by taking the
value of the Read operations metric over a 5-minute period. You can
determine the amount of billed read operations per second by taking
the value of the Billed read operations metric and dividing by 300
seconds. For example, if the Billed read operations returns 13,686,
then the billed read operations per second is 45 (13,686 / 300 =
45.62).
You accrue billed read operations for queries that request database
pages that aren't in the buffer cache and must be loaded from storage.
You might see spikes in billed read operations as query results are
read from storage and then loaded into the buffer cache.
Imagine AWS reports these data points every 5 minutes:
[100,150,200,70,140,10]
And you used the Sum statistic with a 15-minute period, like in the image you attached.
First, the "number" visualization represents the whole selected duration, aggregated; here that would be the total of (100 + 150 + 200 + 70 + 140 + 10) = 670. (I originally wrote that it represents only the last aggregated group, i.e. (70 + 140 + 10) for your 15-minute aggregation, but that was wrong.)
The "line" visualization will represent all the aggregated groups. which would in this case be 2 points (100+150+200) and (70+140+10)
It can be a little hard to understand at first if you are not used to data points and aggregations. So I suggest that you set your "lines" chart to Sum with a 5-minute period; you can then take the value of each point (divide by 300 if you want the per-second rate, as the doc suggests) and sum them all.
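A tiny Python illustration of the same aggregation (using the made-up datapoints above, not your real metrics):

```python
# Hypothetical 5-minute "billed IOs" datapoints from the example above.
datapoints = [100, 150, 200, 70, 140, 10]

# "Number" view over the whole 30-minute selection: one aggregated value.
total = sum(datapoints)                       # 670

# "Lines" view with a 15-minute period: one point per 3 datapoints.
period = 3                                    # 3 x 5 min = 15 min
line_points = [sum(datapoints[i:i + period]) for i in range(0, len(datapoints), period)]
# -> [450, 220]

# Billing estimate at USD 0.22 per million read/write IOs.
cost_usd = total * 0.22 / 1_000_000
print(total, line_points, round(cost_usd, 8))
```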
Added images for easier visualization
I have an SSM parameter that will need frequent updates (almost 20 times a minute). The SSM put_parameter API documentation says it can return an error if the parameter exceeds the maximum number of allowed versions.
ParameterMaxVersionLimitExceeded
The parameter exceeded the maximum number of allowed versions.
HTTP Status Code: 400
https://docs.aws.amazon.com/systems-manager/latest/APIReference/API_PutParameter.html
So my question is: what is the maximum number of version changes we can make, and is that limit configurable?
An official documentation reference would be highly helpful.
To answer your question: there is a limit of 100 stored past values (versions) for a parameter.
For more information, you can go through the link below:
https://docs.aws.amazon.com/general/latest/gr/aws_service_limits.html#limits_ssm
According to AWS Support, 100 is the soft AND hard limit as of March 2022:
Our service team has got back to us and has informed us that unfortunately 100 is the hard limit for this specific service.
They've determined that a parameter can have at most 100 versions at a given time. Parameter Store rotates the parameter's history: when version 101 is created, the oldest version is deleted. This should happen automatically. If you're seeing ParameterMaxVersionLimitExceeded, you may have a label associated with the oldest version, and you may need to move the label to another version to be able to overwrite the parameter.
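As an illustration of that workaround (parameter name and label are placeholders, and the label handling assumes the situation the support answer describes), with boto3 it could look roughly like this:

```python
import boto3
from botocore.exceptions import ClientError

ssm = boto3.client("ssm")

def update_parameter(name, value, label=None):
    """Overwrite the parameter; if the oldest version is pinned by a label,
    move the label to the latest version so history rotation can resume."""
    try:
        ssm.put_parameter(Name=name, Value=value, Type="String", Overwrite=True)
    except ClientError as err:
        if err.response["Error"]["Code"] != "ParameterMaxVersionLimitExceeded":
            raise
        if label is None:
            raise
        # Re-attaching the label (to the latest version by default) moves it
        # off the oldest version, which unblocks the automatic rotation.
        ssm.label_parameter_version(Name=name, Labels=[label])
        ssm.put_parameter(Name=name, Value=value, Type="String", Overwrite=True)

update_parameter("/myapp/staging/cursor", "12345", label="current")
```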
In CloudFront I can set the amount of time before an image expires from the cache. Are there limitations to this? Can I have 500 files (for example) set to stay in cache for 1 year?
There are no restrictions. You can cache files up until 2038 if you want, although practically I would find it hard to believe that a file would last that long on an endpoint.
You can cache as many files as you want for as long as you want.
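For example, with an S3 origin, a one-year lifetime is typically controlled via the Cache-Control header on the object itself (bucket and key below are placeholders), and CloudFront respects it subject to the cache behavior's TTL settings:

```python
import boto3

s3 = boto3.client("s3")

# Upload an object with a one-year Cache-Control header; CloudFront (and
# browsers) use it to decide how long the file may be served from cache.
s3.put_object(
    Bucket="my-example-bucket",
    Key="images/photo-001.jpg",
    Body=open("photo-001.jpg", "rb"),
    ContentType="image/jpeg",
    CacheControl="max-age=31536000",  # one year, in seconds
)
```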
Actually there are limits.
I'll summarize a few:
Just FYI, invalidation is usually quick, but the documentation states 10-15 minutes.
Also, if you dig deeper, edge locations may hold the data for up to 24 hours without necessarily serving it.
The documentation currently states a limit of 1000 objects.
Objects can be as simple as /mystats.json?user=124; each distinct query string can be treated as a separate object.
There is a maximum of 4000 characters for an object name.
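For completeness, an invalidation request (distribution ID and paths below are placeholders) is where the object limit above applies; each listed path counts as an object:

```python
import time
import boto3

cloudfront = boto3.client("cloudfront")

# Invalidate two objects; the number of paths per batch is what the
# object limit applies to, and CallerReference must be unique per request.
cloudfront.create_invalidation(
    DistributionId="E1EXAMPLE123",
    InvalidationBatch={
        "Paths": {"Quantity": 2, "Items": ["/mystats.json", "/images/*"]},
        "CallerReference": str(time.time()),
    },
)
```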