How to run a 'greater than' query in Amazon DynamoDB?

The primary key of my table is 'OrderID', a number that increments for every new item. An example table would look like -
Let's assume that I want to get all orders above the OrderID '1002'. How would I do that?
Is there any possibility of doing this with DynamoDB Query?
Any help is appreciated :)
Thanks!

Unfortunately, with this base table you cannot run a Query with a greater-than condition on the partition key; Query only accepts an equality condition there.
You have 3 choices:
1. Switch to a Scan. This reads the whole table, so it will use up your read capacity significantly.
2. Create a secondary index. You'd want a global secondary index with OrderID as its sort key (the index still needs a partition key of its own). Take a look here: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GSI.OnlineOps.html#GSI.OnlineOps.Creating.
3. Loop in the application, performing a Query or GetItem request for each ID from the initial value until there are no results left (very inefficient).
The best practice would be to use the GSI if you can, as this will be the most performant.
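As a rough sketch of the GSI option, the request for adding an index to an existing table could be built like this. The table name, index name and the CustomerID partition attribute are hypothetical (the question does not say which attribute would make a sensible partition key), and the sketch assumes an on-demand table; a provisioned table would also need a ProvisionedThroughput entry inside "Create".

```python
def build_create_gsi_request(table_name, index_name, partition_attr, sort_attr):
    """Build keyword arguments for an UpdateTable call that adds a GSI.

    Assumption: the partition attribute is a string and the sort attribute
    is a number; adjust the AttributeType values to match your data.
    """
    return {
        "TableName": table_name,
        # Every attribute used in the new index's key schema must be declared.
        "AttributeDefinitions": [
            {"AttributeName": partition_attr, "AttributeType": "S"},
            {"AttributeName": sort_attr, "AttributeType": "N"},
        ],
        "GlobalSecondaryIndexUpdates": [
            {
                "Create": {
                    "IndexName": index_name,
                    "KeySchema": [
                        {"AttributeName": partition_attr, "KeyType": "HASH"},
                        {"AttributeName": sort_attr, "KeyType": "RANGE"},
                    ],
                    # Copy all attributes into the index (simplest choice).
                    "Projection": {"ProjectionType": "ALL"},
                }
            }
        ],
    }


# Hypothetical names, for illustration only.
request = build_create_gsi_request(
    "Orders", "CustomerID-OrderID-index", "CustomerID", "OrderID"
)
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("dynamodb").update_table(**request)
```

Once the index is active, a Query against it can combine an equality condition on the partition attribute with a greater-than condition on OrderID.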

Related

How to sort DynamoDB table by a single column?

I'd like to list records from my DDB table ordered by creation date.
My table has an attribute DateCreated.
All examples I can find describe ordering within some partition.
But I want global ordering.
Am I supposed to create an artificial attribute which will have the same value across all records, just to use it as a partition key? E.g. add new attribute GlobalPartition with value 1 to every record in the table, and create a GSI with partition key GlobalPartition and sort key DateCreated. Isn't there a better way?
Thx!
As you noticed, DynamoDB indeed does not have an option to sort items "globally". In other words, there is no way to Scan the database in sorted partition-key order. You can only sort items inside one partition, sorted by the "sort key".
When you have a small amount of data, you can indeed do what you said: have a single partition with everything in it. However, it's not clear how practical this approach remains as your single partition grows to gigabytes or terabytes, or how well DynamoDB can load-balance when you have just a single partition (I never saw any DynamoDB documentation which answers this question).
So another option is not to have a single partition but rather a number of them. For example, consider that you want to sort items by date. Now, instead of having a single partition, have a partition per month, i.e., the partition key is the month number. If you want to sort everything within a month, you can do it directly, but if you want a sorted list for a full year, you need to Query twelve partitions, in order, getting a sorted list from each one and combining them into a sorted list for the full year. So-called time-series databases are often modeled this way.
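A minimal sketch of the month-per-partition idea, assuming a partition key attribute named MonthKey with values like "2023-07" and DateCreated as the sort key (both the attribute name and the key format are made up for illustration). The client argument is anything with a boto3-style query method, for example boto3.client("dynamodb").

```python
def query_year_sorted(client, table_name, year):
    """Return all items for `year` in date order, one month partition at a time."""
    items = []
    for month in range(1, 13):
        month_key = f"{year}-{month:02d}"  # hypothetical partition key format
        kwargs = {
            "TableName": table_name,
            "KeyConditionExpression": "MonthKey = :m",
            "ExpressionAttributeValues": {":m": {"S": month_key}},
            "ScanIndexForward": True,  # ascending by sort key within the month
        }
        # A single Query response is capped in size, so follow the pagination key.
        while True:
            page = client.query(**kwargs)
            items.extend(page.get("Items", []))
            if "LastEvaluatedKey" not in page:
                break
            kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]
    # Months are visited in order and each month is already sorted,
    # so simple concatenation gives a list sorted across the whole year.
    return items
```

Because the months are queried in calendar order, no extra merge step is needed; for a descending list you would walk the months backwards and set ScanIndexForward to False.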
To sort data in DynamoDB, the attribute you sort on must be the sort key of the table or of an index. If the attribute is not the table's sort key, or the table has no sort key, you need to create a GSI and make that attribute the GSI's sort key. You can use an LSI too. In short, sorting works on any attribute that is the sort key of the table, an LSI, or a GSI.
For more details, check the "ScanIndexForward" parameter of the Query request.
If ScanIndexForward is true, DynamoDB returns the results in the order in which they are stored (by sort key value). This is the default behavior. If ScanIndexForward is false, DynamoDB reads the results in reverse order by sort key value, and then returns the results to the client.
https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_Query.html#API_Query_RequestSyntax
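For illustration, here is roughly how a "newest first" query could look against the single-partition GSI described in the question (partition key GlobalPartition with value 1, sort key DateCreated). The index name passed in is hypothetical, and the client is assumed to be a boto3-style low-level DynamoDB client.

```python
def newest_items(client, table_name, index_name, limit):
    """Return up to `limit` items, newest first, from a single-partition GSI."""
    response = client.query(
        TableName=table_name,
        IndexName=index_name,
        KeyConditionExpression="GlobalPartition = :p",
        ExpressionAttributeValues={":p": {"N": "1"}},
        ScanIndexForward=False,  # read in descending sort-key (DateCreated) order
        Limit=limit,             # stop after evaluating this many items
    )
    return response["Items"]
```

A call might look like newest_items(boto3.client("dynamodb"), "Records", "GlobalPartition-DateCreated-index", 20), with all three names being placeholders.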
The UI has a checkbox for this too.
A "global sort" is not possible: "global" would mean a Scan operation, which simply runs through all rows in the table and applies filters, and it has no sorting option. A Query on an attribute mapped to a sort key has the ScanIndexForward option to change the sort direction.

Dynamo db sorting

I have a scenario in which I will have to list the incoming requests of a user sorted by creation time and priority (High, Medium, Low), along with pagination. Is there a way to achieve this in DynamoDB?
Right now I'm using a secondary Index like userId-createdAt-index which sorts data based on creation time and further sorting the request based on priority separately in the frontend. Somebody please provide a right solution for this.
You're correct to use an index with a sort key. This could also be your primary index, thus reducing how many indexes you need, but that of course depends on whether you already have a sort key on your primary.
DDB guarantees the order of a sorted index, so paging will correctly page by date for you. If you want to reverse the order, add ScanIndexForward to your query and set it to false.
Your model of querying and sorting by date at the DB level, then sorting by other fields at the application level, is normal and correct.
Depending very much on your use-case, another option to consider is querying by priority, adding a condition such as #priority = :priority (as a key condition if priority is part of the index key, otherwise as a filter expression), but I doubt this is what you want.
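To make the "page by date in DynamoDB, sort by priority in the application" model concrete, here is a small sketch. It assumes the userId-createdAt-index from the question, that userId and priority are stored as strings, and a boto3-style low-level client; the function name and the priority ranking table are my own.

```python
PRIORITY_RANK = {"High": 0, "Medium": 1, "Low": 2}


def fetch_page(client, table_name, user_id, page_size, start_key=None):
    """Fetch one page ordered by createdAt, then order it by priority.

    Returns (items, next_key); pass next_key back in as start_key to get
    the following page, or stop when it is None.
    """
    kwargs = {
        "TableName": table_name,
        "IndexName": "userId-createdAt-index",
        "KeyConditionExpression": "userId = :u",
        "ExpressionAttributeValues": {":u": {"S": user_id}},
        "Limit": page_size,
    }
    if start_key is not None:
        kwargs["ExclusiveStartKey"] = start_key
    response = client.query(**kwargs)
    # sorted() is stable, so items with equal priority keep their createdAt order.
    items = sorted(
        response["Items"],
        key=lambda item: PRIORITY_RANK[item["priority"]["S"]],
    )
    return items, response.get("LastEvaluatedKey")
```

Note that this only orders items within each page; a High priority request on page 3 will not surface on page 1.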

Is it possible to run a greater than query in AWS DynamoDB?

Let's say I have my DynamoDB table like this, with Order ID as the primary key:
The Order ID increments by one every time I add/put a new item.
Now, I have one number, let's say 1000, and my user wants to get all the items which have Order ID > 1000.
So the items returned would be 1001, 1002, 1003, and so on till the last one.
My requirement is as simple as it seems - but is this thing possible to do with Query method of AWS DynamoDB?
Any help is appreciated :)
Thanks!
There's currently no way to apply a range condition to the partition key, but I can suggest a way that you can achieve what you want.
You're heading in the right direction with Query, which has a "greater than" operator. However, it only operates on the sort key attribute.
With Query, you essentially choose a single partition key value, and provide a key condition expression that is applied to the sort key of items within that partition.
Since your partition key is currently "Order ID", you'll need to add a Global Secondary Index to query the way you want.
Without knowing more about your access patterns, I'd suggest you add a Global Secondary Index using "From" as the partition key, which I assume is the user ID. You can then use "Order ID" as the sort key.
my user wants to get all the items which have Order ID > 1000.
With the GSI in place, you can achieve this by doing a query for items where "From" (the user ID) equals userId and "Order ID" > orderId.
You can find more on query here, details on adding a GSI here, and more info on choosing a partition key here.
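As a sketch, the Query request against such a GSI could be built like this. The index name is hypothetical, "From" is assumed to hold the user ID as a string, and "Order ID" is assumed to be a number. Since FROM is a DynamoDB reserved word and "Order ID" contains a space, both attribute names go through ExpressionAttributeNames placeholders.

```python
def build_orders_after_query(table_name, index_name, user_id, order_id):
    """Build Query keyword arguments: one user's orders with Order ID > order_id."""
    return {
        "TableName": table_name,
        "IndexName": index_name,
        # Equality on the index partition key, greater-than on its sort key.
        "KeyConditionExpression": "#from = :user AND #oid > :oid",
        "ExpressionAttributeNames": {"#from": "From", "#oid": "Order ID"},
        "ExpressionAttributeValues": {
            ":user": {"S": user_id},
            ":oid": {"N": str(order_id)},  # numbers are sent as strings
        },
    }


# Hypothetical names, for illustration only.
query_kwargs = build_orders_after_query("Orders", "From-OrderID-index", "alice", 1000)
# With boto3 installed and credentials configured:
#   boto3.client("dynamodb").query(**query_kwargs)
```

The results come back sorted by "Order ID" ascending, so 1001, 1002, 1003 and so on for that user.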
No, because Query expects an exact key, and does not allow an expression for the partition key (it does however for the sort key).
What you could use, however, is a Scan with a FilterExpression (see Filter Expressions for Scan and Condition Expressions for the syntax). This reads all records and filters afterwards, so it is not the most efficient way.
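A minimal sketch of the Scan approach, assuming a boto3-style low-level client and a numeric "Order ID" attribute (the function name is my own). The filter is applied after the items are read, so the whole table is still consumed, and Scan returns items in no particular order, hence the sort at the end.

```python
def scan_orders_above(client, table_name, threshold):
    """Scan the whole table, keep items with Order ID > threshold, sorted by ID."""
    kwargs = {
        "TableName": table_name,
        "FilterExpression": "#oid > :min",
        # The attribute name contains a space, so it needs a placeholder.
        "ExpressionAttributeNames": {"#oid": "Order ID"},
        "ExpressionAttributeValues": {":min": {"N": str(threshold)}},
    }
    items = []
    # Scan results are paginated; keep going until no LastEvaluatedKey is returned.
    while True:
        page = client.scan(**kwargs)
        items.extend(page.get("Items", []))
        if "LastEvaluatedKey" not in page:
            break
        kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]
    # Scan does not sort, so order the matches in the application.
    return sorted(items, key=lambda item: int(item["Order ID"]["N"]))
```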

Is it possible in DynamoDb to get all the items with a given Sort Key?

As primary key I have an id for a recipe and the sort key is the type of food (breakfast, meal, snack, etc).
Is there a way with scan or query to get all the items with a given sort key?
As others have pointed out in the comments, you can't query a sort key in the sense that there is no operation that gives a list of items that have the same sort key.
In fact, the whole reason for a sort key is generally to order items in a particular partition.
Putting the two together, what you need is a way to partition the items by the food type and then query on that. Enter the Global Secondary Index (GSI).
With the help of a GSI you can index the data in your table in a way that the food type becomes the partition key, and some other attribute becomes the sort key. Then, getting all the items that match a particular food type becomes possible with a Query.
There are a few things to keep in mind:
a GSI is like another table: it consumes capacity that you will be charged for
a GSI is eventually consistent, meaning changes in the table could take a bit of time before being reflected in the GSI
if you end up creating a GSI where the choice of partition key results in very large partitions, it can lead to throttling (reduced throughput) if any one partition receives a lot of requests
Some more guidelines: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-indexes-general.html
But before you start creating GSIs, consider for a moment the schema of your table: your choice of partition key seems less than ideal. On the one hand, using the recipe id as the partition key is great because it probably results in very good spread of data but on the other hand, you have no ability to use queries on your table without creating GSIs.
Instead of recipe id as the partition key, consider creating a partition key composed of food type, and perhaps another attribute. This way, you can actually query on food type, or perhaps issue several queries to retrieve all items of a particular food type.
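One way to read "food type plus another attribute" is a sharded key, where the other part is a small shard number derived from the recipe id. This is only a sketch of that interpretation; the shard count, the "#" separator and the function names are all assumptions, not something the table design above prescribes.

```python
import zlib

SHARD_COUNT = 4  # hypothetical; pick based on expected load per food type


def partition_key_for(food_type, recipe_id):
    """Compose a partition key such as 'breakfast#2' for a recipe."""
    # crc32 is stable across runs, unlike Python's built-in hash() for strings.
    shard = zlib.crc32(recipe_id.encode("utf-8")) % SHARD_COUNT
    return f"{food_type}#{shard}"


def partition_keys_for_food_type(food_type):
    """List every partition key to Query to collect all items of one food type."""
    return [f"{food_type}#{shard}" for shard in range(SHARD_COUNT)]
```

Writing an item uses partition_key_for; reading all breakfasts means issuing one Query per key returned by partition_keys_for_food_type("breakfast") and combining the results.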

How to perform a range query over AWS dynamoDB

I have an AWS DynamoDB table storing book information; the hash key is the book id. There is an attribute for the book price.
Now I want to perform a query to return all the books whose price is lower than a certain value. How can I do this efficiently, without scanning the whole table?
A query on a secondary index seems to return only the set of entries where the index has a certain value, so I am confused about how to perform a range query efficiently. Thank you very much!
You may be confusing two things: the range key, and a range condition on an attribute.
To clarify, in this case you would need a secondary index, and when querying the index you would specify a key condition (assuming Java and a secondary index on the value; this works in pretty much any language with a supported SDK).
See http://docs.amazonaws.cn/en_us/AWSJavaSDK/latest/javadoc/index.html?com/amazonaws/services/dynamodbv2/model/QueryRequest.html and use a BETWEEN condition.
You can't do a query of that kind. DynamoDB is sharded across many nodes by hash key, so doing a query without a hash key (on all hash keys) is essentially a full scan.
A hack for your case would be to have a hash key with only one value for the whole table, but this is fundamentally wrong because you lose all the pros of using DynamoDB. See the hot hash key issue for more info: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GuidelinesForTables.html