Why is DynamoDB sometimes extremely slow? - amazon-web-services

I am developing an application using DynamoDB. This application is not yet open to the public so only certain employees can access the application.
Generally, the application is very fast and there are no performance issues. Sometimes, however, the application is extremely slow.
At first I suspected that the problem came from the React JS application or from the API, but it actually comes from DynamoDB.
How can I be sure of this?
I tested with Node JS stopped (so the API was offline).
I tested directly in the AWS console, on the "Explore table items" and "PartiQL editor" screens.
DynamoDB was still very slow, and I got this error:
The level of configured provisioned throughput for one or more global secondary indexes of the table was exceeded.
Consider increasing your provisioning level for the under-provisioned global secondary indexes with the UpdateTable API
I cannot understand this, because no application was running.
So why does DynamoDB become slow?
---> Maybe there is a bug in the API. Engineers are working on that.
But why does DynamoDB keep running slowly when the API is offline?
How can I "restart" and/or "stop" the DynamoDB service?
Best regards
Update: 2022-09-05 17h42 (Japan Time)
I created two videos to illustrate what I mean (sorry for the delay: to create the videos I had to wait for the problem to occur again):
Normal Case: DynamoDB is very very fast
https://youtu.be/ayeccV0zk0E
Issue Case: DynamoDB is very very slow
https://youtu.be/1u201N2HV8o
---> In my example, I have only 52 users, so this is a bug, not normal behavior.
Regards

The error message is giving you a potential cause for your perceived slowness.
I suspect that what you perceive as slowness is because the throughput of the Global Secondary Index your app is reading from is exhausted, and the app (or the AWS SDK) is performing exponential backoff to retry the API call.
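To illustrate why throttled calls look like slowness rather than failure, here is a minimal sketch of exponential backoff with jitter. It is not the AWS SDK's actual retry code; `ThrottledError`, `call_with_backoff` and the delay values are made up for the example.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a throttling error from the database client."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.05, max_delay=2.0,
                      sleep=time.sleep):
    """Call operation(), retrying with exponential backoff and jitter on throttling."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of retries: surface the error to the caller
            # The delay ceiling doubles on each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            # Full jitter: wait a random amount between 0 and the ceiling.
            sleep(random.uniform(0, ceiling))
```

Every retry adds waiting time before the caller sees a result, so a request that is throttled several times in a row eventually succeeds but feels very slow.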
The one dimension you scale DynamoDB with aside from the Key schema is Throughput. You decide how many requests per second (it's a bit more complicated than that) DynamoDB can handle, and AWS ensures that load can be served. If you go beyond that, AWS throttles API calls, and you receive the errors.
GSIs have their own throughput that you can manage. I suggest you take a look at the provided metrics to identify where your throughput bottleneck is and adjust the throughput accordingly. If you don't want to deal with throughput at all, switch the table to On-Demand Capacity (Pay per request) and AWS handles that for you at a small premium.
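As a rough sketch of the two options, the functions below only build the parameter dictionaries for the UpdateTable API mentioned in the error message. The table name `Users`, the index name `email-index` and the capacity numbers are assumptions for illustration; replace them with your own.

```python
def gsi_throughput_update(table_name, index_name, read_units, write_units):
    """Build UpdateTable parameters that change one GSI's provisioned throughput."""
    return {
        "TableName": table_name,
        "GlobalSecondaryIndexUpdates": [
            {
                "Update": {
                    "IndexName": index_name,
                    "ProvisionedThroughput": {
                        "ReadCapacityUnits": read_units,
                        "WriteCapacityUnits": write_units,
                    },
                }
            }
        ],
    }


def on_demand_update(table_name):
    """Build UpdateTable parameters that switch the table to on-demand billing."""
    return {"TableName": table_name, "BillingMode": "PAY_PER_REQUEST"}


# Hypothetical names and values, for illustration only.
params = gsi_throughput_update("Users", "email-index", 10, 5)
print(params)
```

With boto3 installed and credentials configured, either dictionary could be passed as `boto3.client("dynamodb").update_table(**params)`.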

The error message mentions provisioned throughput of a GSI, so it is quite likely that this is your problem:
The DynamoDB GSI documentation https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GSI.html#GSI.ThroughputConsiderations explains that
When you create a global secondary index on a provisioned mode table, you must specify read and write capacity units for the expected workload on that index. The provisioned throughput settings of a global secondary index are separate from those of its base table. A Query operation on a global secondary index consumes read capacity units from the index, not the base table. When you put, update or delete items in a table, the global secondary indexes on that table are also updated. These index updates consume write capacity units from the index, not from the base table.
For example, if you accidentally set a GSI's read provisioning to 1, then you can only do, on average, one read per second from this GSI. If you do a scan that needs to return 10 items, it may take around 10 seconds to complete, even if no other application is using the table.
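The arithmetic behind that example can be sketched as follows. This is a simplification: it assumes each item costs one read capacity unit and ignores burst capacity; `seconds_to_read` is a name invented for the example.

```python
def seconds_to_read(read_units_needed, provisioned_rcu):
    """Rough time to consume `read_units_needed` at `provisioned_rcu` units per second."""
    if provisioned_rcu <= 0:
        raise ValueError("provisioned_rcu must be positive")
    return read_units_needed / provisioned_rcu


print(seconds_to_read(10, 1))  # 10 reads against an index provisioned at 1 RCU
print(seconds_to_read(10, 5))  # the same reads against 5 RCU
```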
Please read the aforementioned link for the full story on how to provision secondary indexes in DynamoDB.
If this is not your problem, please update your question with details on the provisioned throughput settings of your base table and its GSI.

Related

Is it possible to write to DynamoDB only when spare capacity is available?

I am working on an application which receives very predictable, heavy traffic during working hours. Users typically interact with the app for about 40 minutes at a time. DynamoDB table A receives a steady stream of writes throughout user sessions and handles things without difficulty. We attempt to write a large amount of data to table B at the end of each session, however, and early in the day this can result in throttling. Our tables are billed on-demand (no, this is not something I am able to change), but the sudden spike in writes still causes throttling, which is expected.
The data being written to table A is both critical and time sensitive. The data going to table B is critical and must not be lost, but delays in data availability from table B on the order of a few hours are acceptable, though not ideal. So I'm looking for a way to say "please write this to the table ASAP, but only as long as it won't cause throttling". Provisioning for the expected capacity is not an option (don't ask). An SQS queue with a long message delay doesn't really fit the bill because (a) 15 minutes may not be long enough and (b) it doesn't meet the "ASAP" part of the story. I've considered pre-warming the table, but that's just kludgy.
So... you take all the expected ways to handle this that were designed and provided by AWS, then say you can't use them. That... doesn't leave you many options.
You're pretty much left with designing some custom architecture. Throttling, provisioning, burst provisioning, on-demand, and so on are all part of the package for handling these kinds of bursts. If you can't use them, then you'll have to do something like write each entry as JSON to an S3 bucket and have some cron job pick them up an hour or so later, one at a time, and batch write them to the table.
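A minimal sketch of the "pick them up later and batch write them" step, assuming the entries have already been loaded into a list. `write_batch` is a hypothetical callback standing in for whatever actually writes to the table; the size of 25 matches the per-call item limit of DynamoDB's BatchWriteItem.

```python
def chunked(items, size=25):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def drain(pending, write_batch):
    """Send all pending items in batches; return how many batches were sent."""
    batches = 0
    for batch in chunked(pending):
        write_batch(batch)  # hypothetical: e.g. one BatchWriteItem call per batch
        batches += 1
    return batches
```

A real job would also need to pace the batches and retry any items the table reports as unprocessed, which this sketch leaves out.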
You may want to take a look at how your table is arranged. If you have to make a lot of writes all at once (i.e., because you have to duplicate data across multiple PK/SK combinations in order to recall it with a single query), then RDS may be better suited for the task at hand. Dynamo is more for quick and snappy queries and not really for extended data logging or storage.
Here's the secret to DDB on-demand...
From the page you linked to
For new on-demand tables, you can immediately drive up to 4,000 write
request units or 12,000 read request units, or any linear combination
of the two. For an existing table that you switched to on-demand
capacity mode, the previous peak is half the previous provisioned
throughput for the table—or the settings for a newly created table
with on-demand capacity mode, whichever is higher. For more
information, see Initial throughput for on-demand capacity mode.
And the Initial throughput for on-demand capacity mode page says:
Initial Throughput for On-Demand Capacity Mode
If you recently switched an existing table to on-demand capacity mode for the first time, or if you created a new table with on-demand capacity mode enabled, the table has the following previous peak settings, even though the table has not served traffic previously using on-demand capacity mode:
Newly created table with on-demand capacity mode: The previous peak is
2,000 write request units or 6,000 read request units. You can drive
up to double the previous peak immediately, which enables newly
created on-demand tables to serve up to 4,000 write request units or
12,000 read request units, or any linear combination of the two.
Existing table switched to on-demand capacity mode: The previous peak
is half the maximum write capacity units and read capacity units
provisioned since the table was created, or the settings for a newly
created table with on-demand capacity mode, whichever is higher. In
other words, your table will deliver at least as much throughput as it
did prior to switching to on-demand capacity mode.
The key thing to realize is that DDB on-demand "peaks" are never lowered.
So if you have a table that at some point peaked at 20K WCU, you can scale cleanly from 1 to 20K WCU without throttling.
In other words, you shouldn't continue to see throttling in an app unless you hit a new peak.
You can also artificially set the peak by changing the table to provisioned at double the expected peak. Then when you convert it back to on-demand, you'll have a "peak" set for half the provisioned capacity.
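The rules quoted above can be written out as a small calculation. This is only a restatement of the quoted documentation text for the write side; the function names are invented, and the numbers should be checked against the current documentation before relying on them.

```python
NEW_TABLE_PEAK_WRITES = 2_000  # write request units, per the quoted documentation


def previous_peak_after_switch(max_provisioned_wcu):
    """Previous peak assigned when a provisioned table is switched to on-demand."""
    return max(max_provisioned_wcu / 2, NEW_TABLE_PEAK_WRITES)


def immediate_write_ceiling(previous_peak):
    """The quoted text says you can drive up to double the previous peak immediately."""
    return 2 * previous_peak


# Provisioning 40,000 WCU before switching back to on-demand sets a 20,000 peak.
print(previous_peak_after_switch(40_000))
print(immediate_write_ceiling(NEW_TABLE_PEAK_WRITES))
```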

Will a click counter slow down my DynamoDB API?

I want to create a DynamoDB WebAPI. It allows the creation and reading of Posts. Now I would like to implement a click counter that updates the popularity of a post each time a user requests it. For this reason, every time a GET request for a posts comes in, I would change the Post object itself.
But I know that DynamoDB is optimized for reads, not for writes. So updating the object that is being fetched every time would probably be a problem.
So how can I measure the popularity of posts without slowing down the API itself? I was thinking of generating a random number for every fetch and only updating it if it is below 0.05 or something similar.
But is there a better solution for this?
DynamoDB isn't "optimized for reads"; it's optimized to provide "consistent, single-digit millisecond response times at any scale."
To optimize DDB for reads, you'd want to stick an Amazon DynamoDB Accelerator (DAX) instance in front of it for "faster access with microsecond latency".
In actuality, the DDB read/write performance isn't going to be an issue. In your case the network latency between your app and DDB will be orders of magnitude higher. By making two calls synchronously, one after the other, you'd be doubling your response time, regardless of what cloud DB you're writing to.
Assuming the data and counter are in the same record, the simple DDB solution in this case is not to make one call to GetItem() and another to UpdateItem(). Instead, make a single call to UpdateItem() with an UpdateExpression that uses the ADD action to add 1 to your counter, and set the ReturnValues parameter to ALL_OLD or ALL_NEW so the item comes back in the same response.
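A sketch of what such a single UpdateItem request could look like. The function only builds the parameters; the table name `Posts`, the key attribute `PostId` and the counter attribute `Clicks` are assumptions for the example.

```python
def increment_clicks_request(table_name, post_id):
    """Build UpdateItem parameters that add 1 to a counter and return the item."""
    return {
        "TableName": table_name,
        "Key": {"PostId": {"S": post_id}},             # hypothetical key attribute
        "UpdateExpression": "ADD #clicks :one",        # ADD increments a number
        "ExpressionAttributeNames": {"#clicks": "Clicks"},  # hypothetical counter
        "ExpressionAttributeValues": {":one": {"N": "1"}},
        "ReturnValues": "ALL_NEW",                     # return the item after the update
    }


params = increment_clicks_request("Posts", "post-123")
print(params["UpdateExpression"])
```

With boto3, the dictionary could be sent as `boto3.client("dynamodb").update_item(**params)`, and the updated item would come back under the `Attributes` key of the response.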
Other, more complex solutions:
Assuming you've already got the data for display, do an async call to UpdateItem().
At scale, you might consider disconnecting the counter update from your app. Your app posts an SQS message, which is processed by a Lambda that could use batch updates to DDB.
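One way the queue consumer could reduce the number of writes is to add up the clicks per post before touching the table. The sketch below assumes each message body is a JSON string with a `post_id` field, which is an invented format for the example.

```python
import json
from collections import Counter


def aggregate_clicks(message_bodies):
    """Count clicks per post id from a batch of JSON message bodies."""
    counts = Counter()
    for body in message_bodies:
        counts[json.loads(body)["post_id"]] += 1
    return counts


batch = ['{"post_id": "a"}', '{"post_id": "b"}', '{"post_id": "a"}']
for post_id, clicks in aggregate_clicks(batch).items():
    # Each total could then be applied with one UpdateItem ADD of `clicks`.
    print(post_id, clicks)
```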

AWS DynamoDB performance is slow

For my application I am using a free-tier AWS account. I have given the DynamoDB table 5 read capacity units and 5 write capacity units (I can't increase the capacity because I will be charged if I do), and I am using the Scan operation. The API takes between 10 and 20 seconds to load.
I have tried a parallel scan too, but the API takes the same time to load. Is there an alternative service in AWS?
It is not a good idea to use a Scan on a NoSQL database.
DynamoDB is optimized for Query requests. The data will come back very quickly, guaranteed (within the allocated Capacity).
However, when using a Scan, the database must read every item in the table, and every item read consumes read capacity. So, if you have a table with 1000 small items, a Query for one item would consume about one unit, whereas a Scan would consume capacity for all 1000 items.
So, either increase the Capacity Units (and cost) or, best of all, use a Query rather than a Scan. Indexes can also help.
You might need to re-think how you store your data if you always need to do a Scan.
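For comparison, here is a sketch of the parameters for a Query on a partition key. The table name `Orders`, the key name `CustomerId` and its value are assumptions; the function only builds the request.

```python
def query_request(table_name, key_name, key_value):
    """Build Query parameters that fetch only the items with one partition key."""
    return {
        "TableName": table_name,
        "KeyConditionExpression": "#pk = :pk",
        "ExpressionAttributeNames": {"#pk": key_name},
        "ExpressionAttributeValues": {":pk": {"S": key_value}},
    }


# Hypothetical table and key, for illustration only.
params = query_request("Orders", "CustomerId", "customer-42")
print(params["KeyConditionExpression"])
```

With boto3 this could be sent as `boto3.client("dynamodb").query(**params)`. A Scan request has no key condition, which is why it has to read the whole table.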

DynamoDB - limit on number of tables per account

We are working on deploying our product (currently on-prem) on AWS and are looking at DynamoDB as an alternative to Cassandra, mainly to avoid the DevOps costs associated with a large number of Cassandra clusters.
The DynamoDB doc says that the per account limit on the number of tables is 256 per region but can be increased by calling AWS support. How much is the max limit for this per account?
Our product is separated into distinct logical units where each such unit will have several tables (say 100). Each customer can have several such units. Each logical unit can be backed up (i.e. a snapshot taken) and that snapshot can be restored at any time in the future (to overwrite the current content of all tables). Backup/restore performance (the time taken to take a snapshot or import old data for all the tables) needs to be good; it cannot take several minutes or hours.
We were thinking of using a distinct set of tables for each such logical unit, so that backup/restore is quick using EMR on S3. But if we follow this approach, we will run out of the 256-table limit even with one customer. It looks like there are 2 options:
Create a new account for each such logical unit for each customer. Is this possible? We will have a main corporate account I suppose (I am still learning about this), but can it have a set of sub-accounts for our customers using IAM each of which is considered as an independent AWS account?
Use each table in a true multi-tenant manner, where the primary key contains the customer id + logical unit id. But in this scenario, when using EMR to back up an entire table, we will need to selectively back up a specific set of rows/items, which may be in the millions, and this will go on while other write/read operations are happening on a different set of items. Is this feasible at large scale?
Any other thoughts on how to approach this?
Thanks for any info.
I would suggest changing the approach: rather than thinking about how to get more tables by creating more accounts,
I would think about how to use fewer tables.
Having said that, you could contact support and increase the number of tables for your account.
I think that you will run into a money problem, due to the current pricing model of provisioning throughput per table.
Many people split tables based on time frame,
e.g. this week's table, last week's table, then move the data to last month's table and so on.
This helps when analyzing the data with EMR/Redshift, so you won't have to pull the whole table every time.
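A small sketch of the time-frame idea: derive the table name from the ISO week of the record's date. The naming scheme and the `events` prefix are made up for the example.

```python
from datetime import date


def weekly_table_name(prefix, day):
    """Return a table name such as 'events_2022_w36' for the ISO week of `day`."""
    iso_year, iso_week, _ = day.isocalendar()
    return f"{prefix}_{iso_year}_w{iso_week:02d}"


print(weekly_table_name("events", date(2022, 9, 5)))  # events_2022_w36
```

Using the ISO year rather than the calendar year keeps the first days of January in the same table as the rest of their week.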

Update DynamoDB table for Provisioned Throughput

The DynamoDB API offers a way to increase or decrease a table's Provisioned Throughput, but the table leaves the Active state while the update is applied. What if two scripts are running against the same table at the same time, and one of them reads while the other updates the table? What is going to happen to the one that is reading? Is it going to fail?
I think that before every read I could check whether the table is in the Active state and, if not, just wait until it is, but then I would need to make this check every time I Query/Scan the database. Maybe it's not necessary.
Does anyone know about this?
It's not necessary; you can still read from the table while it's being updated.
EDIT:
from http://aws.amazon.com/dynamodb/faqs/
Q: Does Amazon DynamoDB remain available when I ask it to scale up or down by changing the provisioned throughput?
Yes. Amazon DynamoDB is designed to scale its provisioned throughput up or down while still remaining available.
By default, DynamoDB reads are "eventually consistent", so a query/scan may not see the most recently written rows, but the request will not fail. You can request strongly consistent reads if you need them (though they consume twice as much Read Capacity as eventually consistent reads).
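The capacity cost of the two read modes can be sketched as below, assuming the standard accounting of one read capacity unit per 4 KB for a strongly consistent read and half of that for an eventually consistent one. The function name is invented for the example.

```python
import math


def read_capacity_units(item_bytes, strongly_consistent):
    """Read capacity consumed by reading one item of `item_bytes` bytes."""
    units = math.ceil(item_bytes / 4096)  # item size is rounded up to 4 KB blocks
    return units if strongly_consistent else units / 2


print(read_capacity_units(4096, strongly_consistent=True))
print(read_capacity_units(4096, strongly_consistent=False))
```

With boto3, a strongly consistent read is requested by passing `ConsistentRead=True` to calls such as `get_item`, `query` or `scan`.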
See the docs for more information.