Cluster Sharding and Distributed Data - How big can it become? - akka

I have a question about Cluster Sharding and 'state-store-mode=ddata'.
We have an Actor System with 10 000 000 actors and we are using Cluster Sharding. Out of the box it is configured for Distributed Data ('ddata'), and the Cluster Sharding page says in big letters, under "Persistence Mode Deprecated":
Warning
Persistence for state store mode is deprecated. It is recommended to migrate to ddata for the coordinator state and if using replicated entities migrate to eventsourced for the replicated entities state.
The data written by the deprecated persistence state store mode for remembered entities can be read by the new remember entities eventsourced mode.
Once you’ve migrated you can not go back to persistence mode.
However, I also found some articles on the internet saying that Akka Distributed Data is not a good idea for big systems, and I think 10 000 000 actors counts as a big system:
Akka Distributed Data - Scaling
Akka Distributed Data - Large Data Set
So my questions are:
Do you know which configuration parameters affect how Cluster Sharding with Distributed Data scales?
For Cluster Sharding, my experiments show that with more shards, sharding with Distributed Data scales better. Is this a correct assumption?
If Distributed Data is not appropriate for this number of actors, should I stick with the 'persistence' mode, which the Akka documentation marks as deprecated?
If neither ddata nor persistence is the way to go for this number of actors, what should I use instead?
Thanks for any answers.

Do you know what configuration parameters have an effect on the
scaling of the cluster sharding - distributed data
Distributed Data does not have any direct effect on the scaling of shards. It can handle up to about 100,000 top-level entries, which translates to support for tens of thousands of shards.
The communication from the client to the shard allocation strategy is
via Distributed Data. It uses a single LWWMap that can support 10s of
thousands of shards. Later versions could use multiple keys to support
a greater number of shards.
You can control the number of shards with the following parameter:
akka.cluster.sharding {
# Number of shards used by the default HashCodeMessageExtractor
# when no other message extractor is defined. This value must be
# the same for all nodes in the cluster and that is verified by
# configuration check when joining. Changing the value requires
# stopping all nodes in the cluster.
number-of-shards = 1000
}
It creates the specified number of shards according to the following rule:
By default the shard identifier is the absolute value of the hashCode
of the entity identifier modulo the total number of shards.
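As a rough illustration of that rule, here is a minimal Python sketch. It is not Akka code: `java_string_hash` and `shard_id` are hypothetical helper names, and the sketch assumes the entity identifier is a string hashed the way `java.lang.String.hashCode()` does it. It follows the wording of the documentation quote and may differ from the real extractor in edge cases.

```python
def java_string_hash(s: str) -> int:
    """Mimic java.lang.String.hashCode(): a signed 32-bit polynomial hash."""
    h = 0
    data = s.encode("utf-16-be")
    # Java hashes UTF-16 code units, so read the string two bytes at a time.
    for i in range(0, len(data), 2):
        unit = (data[i] << 8) | data[i + 1]
        h = (31 * h + unit) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit value as a signed Java int.
    return h - 0x100000000 if h >= 0x80000000 else h


def shard_id(entity_id: str, number_of_shards: int) -> int:
    """Absolute value of the hash code, modulo the total number of shards."""
    return abs(java_string_hash(entity_id)) % number_of_shards


if __name__ == "__main__":
    # "hello".hashCode() is 99162322 in Java, so with 1000 shards this prints 322.
    print(shard_id("hello", 1000))
```

The point is that the shard is derived from the entity identifier alone, so adding more entities never adds more shards.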
For Cluster Sharding, my experiments show that when I have more shards,
Sharding Distributed Data scales better. Is this a correct assumption
Yes and no. Too few shards lead to uneven entity allocation across nodes; too many shards add shard allocation overhead. A good target is the number of nodes times 10. See the same docs:
As a rule of thumb, the number of shards should be a factor ten
greater than the planned maximum number of cluster nodes. It doesn’t
have to be exact. Fewer shards than number of nodes will result in
that some nodes will not host any shards. Too many shards will result
in less efficient management of the shards, e.g. rebalancing overhead,
and increased latency because the coordinator is involved in the
routing of the first message for each shard.
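To make the rule of thumb concrete, here is a small Python sketch. The helper names are hypothetical, and it assumes the shards are spread as evenly as possible across the nodes, which simplifies what the real allocation strategy does.

```python
from typing import List


def recommended_shards(max_nodes: int, factor: int = 10) -> int:
    """Rule of thumb from the docs: ten times the planned maximum node count."""
    return max_nodes * factor


def shards_per_node(num_shards: int, num_nodes: int) -> List[int]:
    """Spread shards as evenly as possible (a simplifying assumption)."""
    base, extra = divmod(num_shards, num_nodes)
    return [base + 1 if i < extra else base for i in range(num_nodes)]


if __name__ == "__main__":
    print(recommended_shards(5))    # 50
    print(shards_per_node(3, 5))    # [1, 1, 1, 0, 0]: two nodes host nothing
    print(shards_per_node(7, 5))    # [2, 2, 1, 1, 1]: some nodes carry double
    print(shards_per_node(50, 5))   # [10, 10, 10, 10, 10]: even
```

With roughly ten shards per node, one shard more or less on a node changes its load by about 10%, which is why the number does not have to be exact.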
If Distributed Data is not appropriate for this number of actors, should
I stick to the 'persistence' mode, which is marked as deprecated in the
Akka documentation
Having 10 million actors does not influence the size of the sharding state, so it is appropriate to use ddata mode.

Related

Dynamodb streams: small number of items per batch

I have a very large DynamoDB table, and I want to use a Lambda function triggered by a stream. I would like to work in big batches of at least 1000 items. But when I connect the Lambda, I see it is invoked with tiny batches of 1 or 2 items. I increased the window to 15 seconds, and it doesn't help.
I assume it's because the table has a lot of shards, and every batch gathers items from only one shard. Is this correct?
What can be done in order to increase the batch size?
I wrote a deep-dive blog post about the integration of DynamoDB Streams and Lambda (disclaimer: written by me on the company blog, and very relevant to the question). The images are taken from there.
DynamoDB Streams consist of shards that store a record of changes sequentially. Each storage partition in the table maps to at least one shard of a DynamoDB stream. The shards get split if a shard is full or the throughput is too high.
Conceptually, this is how the Lambda Service polls the stream shards:
Crucially, polling the shards happens in parallel, but batching is always per shard in order to maintain the order of changes and have consistent scale-out behavior.
This diagram shows how the configuration options in the event source mapping influence how processing happens.
Let's focus on your situation. If you have a large number of items, and relatively high throughput, chances are that DynamoDB allocates many storage partitions to handle that throughput. That automatically leads to a large number of stream shards (#shards >= #storage_partitions).
If your changes are well distributed over the table (which is what you want to distribute the load evenly), this means there aren't many changes written to any single shard at any point in time. So for a batch window of a few seconds (15 in your case), the actual batch size may be low. If the changes are focused on some partitions, you should see a relatively high variance in the batch size (unfortunately, there's no metric for it afaik).
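A back-of-the-envelope estimate shows the effect. This Python sketch assumes writes are spread perfectly evenly over the shards. The function names and the numbers (100 writes per second, 500 shards) are made up for illustration, not taken from your table.

```python
def expected_records_per_batch(writes_per_second: float,
                               shard_count: int,
                               window_seconds: float,
                               max_batch_size: int) -> float:
    """Records one shard accumulates during the batch window, capped by batch size."""
    per_shard = writes_per_second * window_seconds / shard_count
    return min(per_shard, max_batch_size)


def window_needed(target_batch: int,
                  writes_per_second: float,
                  shard_count: int) -> float:
    """Batch window (seconds) needed for one shard to collect target_batch records."""
    return target_batch * shard_count / writes_per_second


if __name__ == "__main__":
    # Hypothetical workload: 100 writes/s over 500 shards, 15 s window.
    print(expected_records_per_batch(100, 500, 15, 1000))  # 3.0
    print(window_needed(1000, 100, 500))                   # 5000.0
```

Under these assumed numbers each invocation would see about 3 records, and filling a batch of 1000 would need a window far longer than 15 seconds, so compare the result against the maximum window your event source mapping allows.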
The only thing you can control directly here (without larger architectural changes) is the batch window. If you increase that, you should see larger batch sizes at the expense of higher processing latency.
You could consider having a Lambda function write these changes to a Kinesis Firehose delivery stream, configure it to write records in batches to S3, and have another Lambda respond to objects written to S3. This would increase your latency again, but allows for much larger batch sizes.
(I also considered writing to SQS, but the max batch size you can request from there is 10.)

How does DynamoDB adaptive scaling rebalance partitions?

In the DynamoDB doc, it is written:
If your application drives disproportionately high traffic to one or
more items, adaptive capacity rebalances your partitions such that
frequently accessed items don't reside on the same partition.
My question is:
What exactly is meant by "rebalance"?
Are some items copied to a new partition and removed from the original one?
Does this process impact performance?
How long does it take?
Items are split across two new partitions. The split initiates when the database decides there's been enough sustained traffic in a spread pattern where a split would be beneficial, and then the split itself takes a few minutes. In testing with on-demand tables (where I created synthetic sustained traffic) I've seen the throughput double and then double again, repeating about every 15 minutes.

Is it possible to write to DynamoDB only when spare capacity is available?

I am working on an application which receives very predictable, heavy traffic during working hours. Users typically interact with the app for about 40 minutes at a time. DynamoDB table A receives a steady stream of writes throughout user sessions and handles things without difficulty. We attempt to write a large amount of data to table B at the end of each session, however, and early in the day this can result in throttling. Our tables are billed on-demand (no, this is not something I am able to change), but the sudden spike in writes still causes throttling, which is expected.
The data being written to table A is both critical and time sensitive. The data going to table B is critical and must not be lost, but delays of a few hours in its availability from table B are acceptable, though not ideal. So I'm looking for a way to say "please write this to the table ASAP, but only as long as it won't cause throttling". Provisioning for the expected capacity is not an option (don't ask). An SQS queue with a long message delay doesn't really fit the bill because (a) 15 minutes may not be long enough and (b) it doesn't meet the "ASAP" part of the story. I've considered pre-warming the table, but that's just kludgy.
So... you take all the expected ways to handle this that were designed and provided by AWS, then say you can't use them. That doesn't leave you many options.
You're pretty much left with designing some custom architecture. Throttling, provisioning, burst provisioning, on-demand and the rest are all part of the package for handling these kinds of bursts. If you can't use them, then you'll have to do something like write each entry as JSON to an S3 bucket and have some cron event pick them up an hour or so later, one at a time, and batch write them to the table.
You may want to take a look at how your table is arranged. If you have to make a lot of writes all at once (i.e., because you have to duplicate data through multiple PK/SK combinations in order to be able to recall it with a single query), then RDS may be better suited for the task at hand. Dynamo is more for quick and snappy queries and not really for extended data logging or storage.
Here's the secret to DDB on-demand...
From the page you linked to
For new on-demand tables, you can immediately drive up to 4,000 write
request units or 12,000 read request units, or any linear combination
of the two. For an existing table that you switched to on-demand
capacity mode, the previous peak is half the previous provisioned
throughput for the table—or the settings for a newly created table
with on-demand capacity mode, whichever is higher. For more
information, see Initial throughput for on-demand capacity mode.
And the Initial throughput for on-demand capacity mode page says:
Initial Throughput for On-Demand Capacity Mode
If you recently
switched an existing table to on-demand capacity mode for the first
time, or if you created a new table with on-demand capacity mode
enabled, the table has the following previous peak settings, even
though the table has not served traffic previously using on-demand
capacity mode:
Newly created table with on-demand capacity mode: The previous peak is
2,000 write request units or 6,000 read request units. You can drive
up to double the previous peak immediately, which enables newly
created on-demand tables to serve up to 4,000 write request units or
12,000 read request units, or any linear combination of the two.
Existing table switched to on-demand capacity mode: The previous peak
is half the maximum write capacity units and read capacity units
provisioned since the table was created, or the settings for a newly
created table with on-demand capacity mode, whichever is higher. In
other words, your table will deliver at least as much throughput as it
did prior to switching to on-demand capacity mode.
The key thing to realize is that DDB on-demand "peaks" are never lowered.
So if you have a table that at some point peaked at 20K WCU, you can scale cleanly anywhere from 1 to 20K WCU without throttling.
In other words, you shouldn't continue to see throttling in an app unless you hit a new peak.
You can also artificially set the peak by changing the table to provisioned at double the expected peak. Then when you convert it back to on-demand, you'll have a "peak" set for half the provisioned capacity.
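The arithmetic from the quoted documentation can be sketched in a few lines of Python. This only covers write units, the function names are hypothetical, and it is a reading of the quoted rules rather than anything DynamoDB exposes as an API.

```python
NEW_TABLE_PREVIOUS_PEAK_WCU = 2000  # from the quoted docs for a new on-demand table


def previous_peak_after_switch(max_provisioned_wcu: int) -> int:
    """Previous peak after switching a provisioned table to on-demand.

    Half the maximum provisioned write capacity, or the new-table setting,
    whichever is higher.
    """
    return max(max_provisioned_wcu // 2, NEW_TABLE_PREVIOUS_PEAK_WCU)


def instant_write_capacity(previous_peak_wcu: int) -> int:
    """The table can immediately drive up to double the previous peak."""
    return 2 * previous_peak_wcu


def provisioned_wcu_for_peak(target_peak_wcu: int) -> int:
    """Provisioned capacity to set before switching back, per the trick above."""
    return 2 * target_peak_wcu


if __name__ == "__main__":
    print(instant_write_capacity(NEW_TABLE_PREVIOUS_PEAK_WCU))  # 4000
    print(provisioned_wcu_for_peak(20000))                      # 40000
    print(previous_peak_after_switch(40000))                    # 20000
```

So to land on a 20K WCU "peak", you would provision 40K WCU before converting back to on-demand.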

Amazon DynamoDB read latency while writing

I have an Amazon DynamoDB table which is used for both read and write operations. Write operations are performed only when the batch job runs at certain intervals whereas Read operations are happening consistently throughout the day.
I am facing increased read latency whenever the batch jobs perform a significant number of write operations. I looked into having a separate read replica for DynamoDB but found nothing of much use. Global tables are not an option because that's not what they are for.
Any ideas how to solve this?
Going by the Dynamo paper, the concept of a read replica for a record or a table does not exist in Dynamo. Within the same region, you will have multiple copies of a record depending on the replication factor N (where R+W > N). However, when the client reads, one of those copies is returned depending on the cluster health.
Depending on how the coordinator node is chosen, either in the client library or in the cluster, the client can only ask for a record (get) or send a record (put) to either the cluster coordinator (1 extra hop) or the node assigned to the record (single hop to record). There is just no way for the client to say 'give me a read replica from another node'. The replicas are there for fault tolerance: if one of the nodes containing the master copy of the record dies, the replicas will be used.
I am researching the same problem in the context of hot keys. Every record gets assigned to a node in Dynamo, so a million reads on the same record will lead to hot keys, loss of reads/writes, etc. How to deal with this? A read replica would work great, because I could then manage the hot keys in the application and move all extra reads to the read replica(s). But this is again fraught with issues.

AWS DynamoDB: What does the graph imply? What needs to be done? A few of my batch writes (delete requests) failed

Can somebody tell me what needs to be done?
I'm facing a few issues when I have 1000+ events.
A few of them are not getting deleted after my process runs.
I'm doing a batch delete through BatchWriteItem.
Each partition on a DynamoDB table is subject to a hard limit of 1,000 write capacity units and 3,000 read capacity units. If your workload is unevenly distributed across partitions, or if the workload relies on short periods of time with high usage (a burst of read or write activity), the table might be throttled.
It seems you are using DynamoDB adaptive capacity, which automatically boosts throughput capacity to high-traffic partitions. However, each partition is still subject to the hard limit. This means that adaptive capacity can't solve larger issues with your table or partition design. To avoid hot partitions and throttling, optimize your table and partition structure.
https://aws.amazon.com/premiumsupport/knowledge-center/dynamodb-table-throttled/
One way to better distribute writes across a partition key space in Amazon DynamoDB is to expand the space. You can do this in several different ways. You can add a random number to the partition key values to distribute the items among partitions. Or you can use a number that is calculated based on something that you're querying on.
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-partition-key-sharding.html
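Both approaches can be sketched in plain Python. The helper names, the `.` separator and the use of CRC32 for the calculated suffix are assumptions for illustration; any stable hash of the attribute you query on would do.

```python
import random
import zlib
from typing import List, Optional


def random_sharded_key(base_key: str, shard_count: int,
                       rng: Optional[random.Random] = None) -> str:
    """Append a random suffix so writes for one key spread over shard_count keys."""
    rng = rng if rng is not None else random.Random()
    return f"{base_key}.{rng.randrange(shard_count)}"


def calculated_sharded_key(base_key: str, attribute: str, shard_count: int) -> str:
    """Derive the suffix from an attribute you query on, so reads can recompute it."""
    suffix = zlib.crc32(attribute.encode("utf-8")) % shard_count
    return f"{base_key}.{suffix}"


def all_shard_keys(base_key: str, shard_count: int) -> List[str]:
    """Every key to query when the suffix is random and cannot be recomputed."""
    return [f"{base_key}.{i}" for i in range(shard_count)]


if __name__ == "__main__":
    print(random_sharded_key("2014-07-09", 10))
    print(calculated_sharded_key("2014-07-09", "order-42", 10))
    print(all_shard_keys("2014-07-09", 3))
```

The trade-off is on the read side: with a random suffix you have to query every suffixed key and merge the results, while a calculated suffix lets you go straight to the right key when you know the attribute.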