DynamoDB documentation from Amazon says:
One write capacity unit represents one write per second for an item up
to 1 KB in size. If you need to write an item that is larger than 1
KB, DynamoDB must consume additional write capacity units.
But what exactly does a write mean? For example, I have a 2 KB item and I need to update only one field, say a number attribute, which is well under 100 bytes. Does Amazon count this as 1 write unit or 2 write units? I think the total item size matters (which means 2 write units), but I just have to be sure.
Thanks for the help.
The size of any read or write is the total size of the item regardless of how many attributes you read or write.
The only (sort of) exception is global secondary indexes, where the size of the read is the total size of only the attributes of the item that are projected into that index.
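As a quick sketch of that rule (the sizes below are just the ones from your example):

```python
import math

# WCUs consumed by a single standard write (PutItem / UpdateItem / DeleteItem):
# DynamoDB bills against the full stored item size, rounded up to the next 1 KB,
# no matter how few attributes the request actually touches.
def write_capacity_units(item_size_bytes):
    return math.ceil(item_size_bytes / 1024)

# Updating one ~100-byte number attribute inside a 2 KB item:
print(write_capacity_units(2 * 1024))  # -> 2, because the whole item counts
```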
Related
According to the documentation, one RCU provides one strongly consistent read per second of an item up to 4 KB in size, or two eventually consistent reads per second of items up to 4 KB each.
I have tried to derive a formula for finding the required RCUs:
Number of RCUs = (number of items read per second * ceil(number of blocks per item)) / (number of blocks read per second)
Here a block is what DynamoDB can read in a single read, i.e. 4 KB. For strongly consistent reads the number of blocks read per second will be 1, and for eventually consistent reads it will be 2.
I have tried my formula for this example:
If your table item’s size is 5KB and you want to have 90 strongly consistent reads per second, how many read capacity units will you need to provision on the table?
Here,
number of items read per second will be 90
number of blocks per item will be ceil(5KB/4KB) = 2
Since these are strongly consistent reads, the number of blocks read per second will be 1.
So, number of RCUs = (90*2)/1 = 180 units, which looks correct to me.
Is my understanding and the formula I have created correct?
One thing I am confused about is the units of RCU. From the formula it looks like it does not have any units. Why is that?
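For reference, here is the same formula written out as a small Python sketch (the function and variable names are just my own labels):

```python
import math

def rcus_needed(items_per_second, item_size_kb, strongly_consistent=True):
    # a "block" is the 4 KB unit that one RCU covers each second;
    # an eventually consistent read gets two such reads out of every RCU
    blocks_per_item = math.ceil(item_size_kb / 4)
    reads_per_rcu = 1 if strongly_consistent else 2
    return items_per_second * blocks_per_item / reads_per_rcu

print(rcus_needed(90, 5))                             # 180.0, strongly consistent
print(rcus_needed(90, 5, strongly_consistent=False))  # 90.0, eventually consistent
```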
Scenario:
If I read or write an item of 10 bytes, DynamoDB rounds the consumed throughput up to 4 KB for a read and 1 KB for a write. If my entire DB consists of items that are 10-50 bytes and I expect around 10 read/write operations per second, this becomes very inefficient.
Question:
Is there a way to overcome this and use the full potential of each capacity unit?
Here are the rules for "Capacity Unit Consumption for Reads":
GetItem—reads a single item from a table. To determine the number of
capacity units GetItem will consume, take the item size and round it
up to the next 4 KB boundary. If you specified a strongly consistent
read, this is the number of capacity units required. For an eventually
consistent read (the default), take this number and divide it by two.
For example, if you read an item that is 3.5 KB, DynamoDB rounds the
item size to 4 KB. If you read an item of 10 KB, DynamoDB rounds the
item size to 12 KB.
see https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/CapacityUnitCalculations.html
So maybe you could switch to eventually consistent reads.
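For example, with boto3 the consistency mode is chosen per request; eventually consistent is already the default (the table name and key attribute below are placeholders):

```python
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("my-table")  # placeholder table name

# Eventually consistent read (the default): half the RCU cost of a strong read.
response = table.get_item(Key={"id": "123"})

# Strongly consistent read: bills the full rounded-up item size in RCUs.
response = table.get_item(Key={"id": "123"}, ConsistentRead=True)
```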
For PutItem and UpdateItem:
For PutItem, UpdateItem, and DeleteItem operations, DynamoDB rounds
the item size up to the next 1 KB. For example, if you put or delete
an item of 1.6 KB, DynamoDB rounds the item size up to 2 KB.
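A rough sketch of those two rounding rules, using made-up item sizes:

```python
import math

def consumed_units(item_size_bytes):
    # reads bill in 4 KB steps, writes in 1 KB steps
    strong_read = math.ceil(item_size_bytes / 4096)
    eventual_read = strong_read / 2
    write = math.ceil(item_size_bytes / 1024)
    return strong_read, eventual_read, write

print(consumed_units(10))     # (1, 0.5, 1)  -> a 10-byte item still costs full units
print(consumed_units(3500))   # (1, 0.5, 4)  -> a 3.5 KB read rounds up to 4 KB
print(consumed_units(10240))  # (3, 1.5, 10) -> a 10 KB read rounds up to 12 KB
```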
The DynamoDB documentation states that:
One read capacity unit represents one strongly consistent read per second, or two eventually consistent reads per second, for an item up to 4 KB in size. If you need to read an item that is larger than 4 KB, DynamoDB will need to consume additional read capacity units. The total number of read capacity units required depends on the item size, and whether you want an eventually consistent or strongly consistent read.
Assume I have items (rows) 4 KB in size, and this table has 1 million records.
I want to query this table 1000 times for individual items with a Lambda function, and I would like it to be done in four seconds.
My thinking is that:
1000 items in four seconds means 250 items read per second.
Since one RCU does two eventually consistent reads per second, I would need 125 RCUs.
Is this thinking correct?
Furthermore, let's say that two people want to query 1000 items in four seconds at the same time. Does this mean I would need 250 RCUs?
I also have a Lambda function that writes to this same table on a schedule. It first gets some values from an API, parses the JSON, and then inserts them into the table.
This Lambda function will insert 60 records every hour. Each record will be 4 KB; does this mean I would need 240 WCUs to write all 60 in one second?
Due to:
One write capacity unit represents one write per second for an item up to 1 KB in size. If you need to write an item that is larger than 1 KB, DynamoDB will need to consume additional write capacity units. The total number of write capacity units required depends on the item size.
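To make the arithmetic I am describing concrete, this is roughly how I am calculating it (a sketch only, the helper names are mine):

```python
import math

def rcus(items_per_second, item_kb, eventually_consistent=True):
    per_item = math.ceil(item_kb / 4)
    return items_per_second * per_item / (2 if eventually_consistent else 1)

def wcus(items_per_second, item_kb):
    return items_per_second * math.ceil(item_kb / 1)

# 1000 reads of 4 KB items spread evenly over four seconds, eventually consistent:
print(rcus(1000 / 4, 4))      # 125.0
# two callers doing the same thing at the same time:
print(2 * rcus(1000 / 4, 4))  # 250.0
# 60 writes of 4 KB items, all issued within a single second:
print(wcus(60, 4))            # 240
```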
Let's say I have several items in DynamoDB with the same partition key and different sort keys.
Is there any difference in consumed read capacity units if I query the records using a sort-key condition in a single go versus querying each item individually? Assume that the number of sort keys to be fetched at a time is around 50. The official documentation says that
One read capacity unit represents one strongly consistent read per
second, or two eventually consistent reads per second, for an item up
to 4 KB in size.
From this definition, it doesn't seem that there should be a difference since this definition is independent of how we query the database.
Apart from additional network delay, does the second approach have any other downside?
Please note that the cost is based on Read Capacity Units (RCU) and Write Capacity Units (WCU).
RCU formula:
RCU = read capacity units per item × number of reads per second
Before going into the calculation below, work out the item size. You can get the item size from the AWS console.
Go to the DynamoDB table in the AWS console --> Overview tab --> see the details at the bottom.
Let's talk about RCUs. In the above case:
Scenario 1 - Getting all the data in one go using the hash key only:
In this scenario, the amount of data read per request is high (i.e. 50 items' worth). Calculate the total size and check how many RCUs are required.
Scenario 2 - Getting the data multiple times using the hash key and sort key:
In this scenario, the API will be called multiple times, so the number of reads per second goes up. Calculate the number of reads required and check how many RCUs are required.
Compare the RCUs calculated in scenarios 1 and 2, and choose the option that consumes fewer RCUs in order to save cost.
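One detail worth noting when making that comparison, per the DynamoDB capacity calculation documentation: a Query sums the sizes of all returned items and rounds the total up once, whereas each individual GetItem is rounded up on its own. A rough sketch of the difference (item sizes are made up, and the numbers assume strongly consistent reads):

```python
import math

RCU_BLOCK = 4 * 1024

def query_rcus(item_sizes_bytes):
    # Query: the sizes of all returned items are summed, then rounded up once
    return math.ceil(sum(item_sizes_bytes) / RCU_BLOCK)

def individual_get_rcus(item_sizes_bytes):
    # Separate GetItem calls: each item is rounded up on its own
    return sum(math.ceil(size / RCU_BLOCK) for size in item_sizes_bytes)

sizes = [200] * 50  # 50 small items under one partition key (sizes are made up)
print(query_rcus(sizes))           # ceil(10000 / 4096) = 3 RCUs for one Query
print(individual_get_rcus(sizes))  # 50 RCUs, one per GetItem
```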
As per DynamoDB ReadWriteCapacity
Units of Capacity required for writes = Number of item writes per
second x item size in 1KB blocks
Units of Capacity required for reads* = Number of item reads per
second x item size in 4KB blocks
If you use eventually consistent reads you’ll get twice the throughput in terms of reads per second.
If your items are less than 1KB in size, then each unit of Read
Capacity will give you 1 strongly consistent read/second and each unit
of Write Capacity will give you 1 write/second of capacity. For
example, if your items are 512 bytes and you need to read 100 items
per second from your table, then you need to provision 100 units of
Read Capacity.
I am confused by the 4 KB blocks and the 1 KB example mentioned above. If an item is 512 bytes, will it be rounded up to 4 KB, so that 1 read unit allows 1 item read per second? I assumed the item would be rounded up to 1 KB, and hence 1 read capacity unit would allow reading 4 items per second (and 8 items per second with eventually consistent reads). Is this assumption correct?
Let ceil() be a function that rounds non-integer values up to the next highest integer.
1 write unit allows you to write 1 / ceil(item_size / 1kB) items per second.
1 read unit allows you to read 1 / ceil(item_size / 4kB) items per second.
So, for example:
48 write capacity units allows 48 writes of items up to 1 kB, or 24 writes of items over 1kB up to 2kB, or 16 writes of items over 2kB up to 3kB, etc.
48 read capacity units allows you to read 48 items up to 4kB, or 24 items over 4kB up to 8kB.
You can't do more than your subscribed rate, and you may only be able to do less, if the items exceed the block size for the operation in question.
If your items are less than 1KB in size, then each unit of Read Capacity will give you 1 strongly consistent read/second and each unit of Write Capacity will give you 1 write/second of capacity.
This is accurate because items that are <= 1 kB (the write block size) are also <= 4 kB (the read block size) by definition.
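Putting those two rules into a short sketch (the helper names and sizes are mine):

```python
import math

def writes_per_second(write_units, item_size_kb):
    # 1 write unit covers 1 / ceil(item_size / 1 kB) items per second
    return write_units / math.ceil(item_size_kb / 1)

def strong_reads_per_second(read_units, item_size_kb):
    # 1 read unit covers 1 / ceil(item_size / 4 kB) strongly consistent reads per second
    return read_units / math.ceil(item_size_kb / 4)

print(writes_per_second(48, 0.5))         # 48.0  -> 512 bytes rounds up to 1 kB
print(strong_reads_per_second(48, 0.5))   # 48.0  -> 512 bytes rounds up to 4 kB
print(strong_reads_per_second(48, 5))     # 24.0  -> 5 kB rounds up to 8 kB
print(strong_reads_per_second(100, 0.5))  # 100.0 -> the 100 reads/second example above
```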