When does DynamoDB throttle request? - amazon-web-services

In the answer to "How is Amazon DynamoDB throughput calculated and limited?" it's been suggested, that DynamoDB throttles request whenever you exceed provisioned throughput on per second basis. However, this contradicts my experience.
I've table where I post multiple rows, often the number of rows way exceeding provisioned write capacity. This happens in short bursts. At one point I've even got 5 minutes average above provisioned capacity. OTOH, 15 minutes average is below capacity. I haven't got any throttled request in that period.
5 minutes average peaks at 8.053 with provisioned capacity of 6:
15 minutes average peaks well below provisioned capacity:
So when does DynamoDB throttle requests? What kind of average does it take in account? How high above provisioned capacity can the burst be before it gets throttled?

DynamoDB is designed to ensure that your provisioned capacity is available on a per-second basis. If you provision a table for ten 1kB reads per second then DynamoDB will give you enough capacity to handle that throughput rate. In addition, DynamoDB will sometimes allow you to achieve limited bursting above your provisioned throughput for a short period of time. This is intended to absorb natural variations in customer workloads. This bursting is not guaranteed and it is not always available (and the nature of the available bursting may change over time). As is currently described in the best practices documentation, in order to get the best performance you should have an evenly distributed workload that does not exceed your provisioned capacity and distributes the load evenly over the key space. However, if the reality of production behavior for your application deviates from an evenly distributed workload then DynamoDB may absorb some of the bursts.
As for how much to provision your table, it depends a lot on your workload. You could start with provisioning to something like 80% of your peaks and then adjust your table capacity depending on how many throttles you receive (which you can see in your CloudWatch graphs) and your application’s tolerance for latency induced by retries. Keep in mind that DynamoDB does not allow unlimited bursts above your provisioned capacity. You may be able to absorb short bursts but you cannot sustain a throughput rate above your provisioned capacity level for an extended period of time. The general guidance we can give is to provision for something close to your peaks and then dial down while watching for throttles.
This answer was posted in AWS forums
Disclaimer: I work for Amazon, DynamoDB team.

There's a hint in the DynamoDB documentation that explains how bursting works:
When you are not fully utilizing a partition's throughput, DynamoDB retains a portion of your unused capacity for later bursts of throughput usage. DynamoDB currently retains up five minutes (300 seconds) of unused read and write capacity.
But it also says that you cannot rely on this behavior:
However, do not design your application so that it depends on burst capacity being available at all times: DynamoDB can and does use burst capacity for background maintenance and other tasks without prior notice.
At least that would explain why it was possible to have a 5 minute average above the provisioned capacity. With the explanation above, it would even be possible to have 15 minute averages (or longer timespans) to be above the provisioned capacity, if you have a spike in the very beginning of the interval and less usage within the 300 seconds before the start of the interval.

DynamoDB provides some flexibility in your per-partition throughput provisioning by providing burst capacity. Whenever you're not fully using a partition's throughput, DynamoDB reserves a portion of that unused capacity for later bursts of throughput to handle usage spikes.
DynamoDB currently retains up to 5 minutes (300 seconds) of unused read and write capacity. During an occasional burst of read or write activity, these extra capacity units can be consumed quickly—even faster than the per-second provisioned throughput capacity that you've defined for your table.
DynamoDB can also consume burst capacity for background maintenance and other tasks without prior notice.

Related

Do DynamoDB consumed capacity units in on-demand mode compare to provisioned capacity?

Right now I am using on-demand mode for my DynamoDB tables, as I didn't know how much data to expect. But now that the application has run a while, I can see the metrics for ConsumedReadCapacityUnits and ConsumedWriteCapacityUnits for my tables in CloudWatch.
In on-demand mode I pay per request, whereas in provisioned capacity mode I have to pay for the provisioned capacity. If I simply take the metrics for (max) consumed capacity units and compare the prices of those in provisioned capacity mode to my current costs, I believe provisioned capacity mode would be a lot cheaper for me.
My question is, can I simply take the metrics and take the max (plus some buffer) of the consumed capacity units and configure them as provisioned capacity, or is that an error in reasoning on my part?
There are two other things you need to consider:
How 'bursty' is your throughput?
Are you using SDKs to connect to your database?
Setting your provision to the maximum throughput you ever see will ensure you don't get throttled requests, however you will probably be setting the provision too high. Dynamodb can actually consume more provision than you have set using Burst Capacity. This will accomodate short bursts of high throughput over the space of 5 minutes. If you see sustained peaks, for example your database is busy in the day but not the night, you might consider setting your tables to Autoscale. In this case you can set the provisioned throughput lower, and Dyanmodb will automatically scale up provision as required. Note that autoscale is good for workloads that vary over the course of hours (e.g. for handling daily peak hours). It's not good for reacting to events that occur in less than about 30 minutes.
If you are using official SDKs, they will handle throttle responses, and retry any failed requests. This gives Dynamodb some time to scale without your application failing requests.

How to compute initial Auto-scaling limits for DynamoDb table

Our table has bursty writes, expected once a week. We have auto-scaling enabled, with provisioned capacity as 5 WCU's, with 70% target utilization. This suffices for our off-peak (non-bursty) traffic. However, during the bursty writes, the WCU's reach around 1.5-2k, which leads to a lot of throttled writed and ultimately failures to write as well.
1) Is the auto-scaling suitable for such an use-case?
2) If yes, what should our initial provisioned capacity be?
This answer will tell you why auto-scaling is not working for you:
https://stackoverflow.com/a/53005089/4985580
This answer will tell you how you can configure your SDK to retry operations over a much longer period (and therefore stop your operation failures furing peak requests).
What should be done when the provisioned throughput is exceeded?
Ultimately you should probably move your tables to on-demand.
For tables using on-demand mode, DynamoDB instantly accommodates
customers’ workloads as they ramp up or down to any previously
observed traffic level. If the level of traffic hits a new peak,
DynamoDB adapts rapidly to accommodate the workload.
No, auto-scaling is not suitable for your needs. It takes a few minutes to scale up and it does that by increasing a fixed percentage of your current capacity at each time. There's also a limited number of times it scales up or down per day, so you can't get from 5 to 2,000 in a matter of minutes. You may not even get that in a matter of hours.
I'd suggest to try on demand mode, or manually setting capacity to 2,000 some time before you actually need it (it doesn't really scale instantly).
I strongly advise to read the ENTIRE dynamo documentation with regards to best practices for primary key, GSI, data architecture. Depending on the size of your table (lager that 10 Gb), the 2,000 units may get spread across partitions and you could potentially still have throttled requests.

Hot partition problem in DynamoDB gone with the new on-demand feature?

I read the following announcement with great interest.
https://aws.amazon.com/about-aws/whats-new/2018/11/announcing-amazon-dynamodb-on-demand/
The new "on-demand" feature really helps with capacity planning. Reading the documentation, I can't really see if they do some "magic" to resolve the problem of hot partitions, and partition key distribution.
Is partition key design just as important if you provision a table "on-demand"?
Yes, partition key design is just as important. That aspect has not changed.
Since you mentioned adaptive capacity in a comment, one thing to make sure is clear. Once it is on for a table, it is on and DynamoDB is monitoring your table.
There are two features at play here:
* On-demand capacity mode
* Adaptive capacity
On-demand capacity mode allows you to pay per each request to DynamoDB instead of provisioning a particular amount of RCUs/WCUs (this is called provisioned capacity). The benefit is that you only pay for what you use (and not for what you provision), but the downside is that if you receive a constant flow of requests, you would end up paying more if you provisioned the right amount of RCUs/WCUs. The on-demand capacity mode is the best suit for spiky traffic, while the provisioned mode is better for applications with a constant, predictable stream of requests
Adaptive capacity is a different feature, and it can work with either on-demand or provisioned capacity modes. It allows to "borrow" unused capacity from other partitions if one of your partitions receive a higher share of requests. It used to take some time to enable adaptive capacity, but as for now, adaptive capacity is enabled immediately.
Even with adaptive capacity, a good key design is still important. It only helps with cases when it is hard to achieve a balanced distribution of requests among shards. A single partition in DynamoDB can only handle up to 3K RCUs and 1K WCUs. So if a single partition receives more than that even with adaptive capacity requests will be throttled. So you have to design your keys to avoid this scenario.
As of 5/2019 the answer to this question has changed. I'd like to preface my answer by saying I have not validated this against a production workload. Also my answer assumes you, like the OP, are using on-demand pricing.
First a general understanding of how DynamoDB (DDB) adaptive capacity works can be gleamed by reading this. Adaptive capacity is on by default. In short, when a hot partition exceeds its throughput capacity, DDB "moves the rudder" and instantly increases throughput capacity on the partition.
Before 5/2019 you'd 300 seconds of instant burst capacity, then you'd be throttled until adaptive capacity fully "kicked in" (5-30 minutes).
On 5/23/2019 AWS announced that adaptive capacity is now instant. This means no more 5-30 minute wait.
Does this mean if you use DDB on-demand pricing, your hot partition problems go away? Yes and no.
Yes, in that you should not get throttled.
No, in that your bank account will take the hit. In order to not get throttled, DDB will scale up on-demand (now instantly). You will pay for the RCUs and WCUs needed to handle the throughput.
So the moral is, you still have to think about hot partitions and design your application to avoid them as much as possible. You won't be paying for it in downtime/unhappy customers, you'll be paying for it out of profits.
#Glenn First of all thankyou for the great question, after some research i have reached to the conclution that hot partition problem is still important but only for 5-30 minutes as soon as dynamo db will detect you are having hot partitions it will use the mechanism like adaptive capacity and automatic resharding, dynamo db has improved a lot since its launch and now AWS handles hot partitions by something called automatic resharding, i think automatic resharding works in both on demand and provision model but i could not find any proof for that i will update the answer as soon as i find it for reference you can watch this keynote.
AWS reinvent 2018 keynote

How to calculate Target Utilization in DynamoDB table?

We know the minimum, maximum provisioned capacity for a certain table.
For example our minimum capacity is 200 reads per second and maximum is 1000 read per second, so what should be the target utilization percentage ?
Some background for a complete answer; DynamoDB provides an Autoscaling option for managing throughput capacity. With autoscaling you define a minimum, maximum and target utilization.
DynamoDB Autoscaling will then vary the provisioned throughput between the maximum and mimumum bounds set. It will aim to keep this throughput provision at the utilization capacity.
Target utilization is the ratio of consumed capacity units to
provisioned capacity units, expressed as a percentage
A good starting point is to ask why not set target utilization to 100%? This sounds efficient, because you will only be paying for the throughput you use. But there is a problem to this:
DynamoDB auto scaling modifies provisioned throughput settings only
when the actual workload stays elevated (or depressed) for a sustained
period of several minutes
So, imagine your target utilization is 100% and you have increased demand on your table for 15 minutes. For the first 5 minutes you might be saved by burst capacity, in the second lot of 5 minutes you are likely to see database read/write failures as your throughput is exceeded, and then after around 10 minutes Autoscaling should kick in and increase your throughput.
This is the problem you are trying to avoid by setting target utilization (i.e. an increase in demand causing throttling). You need to consider two things
1) What is the biggest change in throughput capacity usage you see over a time period of 15 minutes expressed as a percentage? Leave this amount of room in your target utilization.
2) How much do you care if you have some database throttling? (i.e. some database read/writes fail?) Adjust your target utilization higher or lower depending on your appetite for cost saving versus throttling.
Lets say you look over one week of data, and find that in a 15 minute period, the largest increase in throughput you see is 20%. That gives you a target utilization of 80% (because then your increased demand is absorbed by autoscaling)*. However lets say you are cautious and you really aren't OK with database throttling, so to be on the safe side, you might go with 70%.
Hope that helps make some decisions. In summary, your target utilization should be a function of how quickly your throughput capacity changes, and how averse you are to throttling.
EDIT:*The maths isn't perfect here, but you get the idea I think. And its probably a close enough approximation.

Avoid throttle dynamoDB

I am new to cloud computing, but had a question if a mechanism as what I am about to describe exists or is possible to create?
Dynamodb has provisioned throughput (eg. 100 writes/second). Of course, in real world application real life throughput is very dynamic and will almost never be your provisioned amount of 100 writes/second. I was thinking what would be great would be some type of queue for dynamodb. For example, my dynamodb during peak hours may receive 500 write requests per second (5 times what I have allocated) and would return errors. Is it there some queue I can put in between the client and database, so the client requests go to the queue, the client gets acknowledged their request has been dealt with, then the queue spits out the request to the dynamodb at a rate of 100/ writes per second exactly, so that way there are no error returned and I don't need to raise the through put which will raise my costs?
Putting AWS SQS is front of DynamoDB would solve this problem for you, and is not an uncommon design pattern. SQS is already well suited to scale as big as it needs to, and ingest a large amount of messages with unpredictable flow patterns.
You could either put all the messages into SQS first, or use SQS as an overflow buffer when you exceed the design thoughput on your DynamoDB database.
One or more worker instances can than read messages from the SQS queue and put them into DynamoDB at exactly the the pace you decide.
If the order of the messages coming in is extremely important, Kinesis is another option for you to ingest the incoming messages and then insert them into DynamoDB, in the same order they arrived, at a pace you define.
IMO, SQS will be easier to work with, but Kineses will give you more flexibility if your needs are more complicated.
This cannot be accomplished using DynamoDB alone. DynamoDB is designed for uniform, scalable, predictable workloads. If you want to put a queue in front of DynamoDB you have do that yourself.
DynamoDB does have a little tolerance for burst capacity, but that is not for sustained use. You should read the best practices section Consider Workload Uniformity When Adjusting Provisioned Throughput, but here are a few, what I think are important, paragraphs with a few things emphasized by me:
For applications that are designed for use with uniform workloads, DynamoDB's partition allocation activity is not noticeable. A temporary non-uniformity in a workload can generally be absorbed by the bursting allowance, as described in Use Burst Capacity Sparingly. However, if your application must accommodate non-uniform workloads on a regular basis, you should design your table with DynamoDB's partitioning behavior in mind (see Understand Partition Behavior), and be mindful when increasing and decreasing provisioned throughput on that table.
If you reduce the amount of provisioned throughput for your table, DynamoDB will not decrease the number of partitions . Suppose that you created a table with a much larger amount of provisioned throughput than your application actually needed, and then decreased the provisioned throughput later. In this scenario, the provisioned throughput per partition would be less than it would have been if you had initially created the table with less throughput.
There are tools that help with auto-scaling DynamoDB, such as sebdah/dynamic-dynamodb which may be worth looking into.
One update for those seeing this recently, for having burst capacity DynamoDB launched on 2018 the On Demand capacity mode.
You don't need to decide on the capacity beforehand, it will scale read and write capacity following the demand.
See:
https://aws.amazon.com/blogs/aws/amazon-dynamodb-on-demand-no-capacity-planning-and-pay-per-request-pricing/