AWS Dynamo DB query async with partition key - amazon-web-services

We are using the DynamoDB Scan operation to fetch all the data from our table like this, and it works fine:
var myScanConditions = new List<ScanCondition>();
myScanConditions.Add(new ScanCondition("PartitionKey", ScanOperator.BeginsWith, "data"));
var myData = await Context.ScanAsync(myScanConditions).GetRemainingAsync();
//some code to filter some data from above
In our DynamoDB table, the partition key values look like this:
data#rec1
data#rec2
data#rec3
and so on
I wanted to check whether we can replace the Scan with a Query. I tried the code below, passing the scan conditions to the Query, but it looks like it's not correct. It returns nothing.
var myData = await Context.QueryAsync("data", myScanConditions);
So my question is: is there an option to provide partial text for the partition key to the QueryAsync method and still get all the records back from DynamoDB? For example, in my case above, passing just "data" (partial text) to QueryAsync.
Is there a way to do this?
Thanks

Unfortunately, you can't search the partition key with a Query. Queries require and only support the equal operator on the partition key.
If you truly need to search all records in your table, then you must perform a Scan, as that is exactly what Scans are for, although inspecting all the data comes at a cost.
Some ideas to consider:
If you can exclude some data or focus your search on a specific category defined by another field in your dataset, you could add a Global Secondary Index (GSI) to your table that uses a different field as the partition key and the current partition key as the sort key. You could then perform a query on the GSI which will allow you more flexibility on searching the sort key.
You could also create a GSI that only includes the partition key and no other fields/columns. If you then use this GSI for the Scan, it would improve the performance and the cost of the Scan, since only the single key column is searched/loaded instead of the entire table. Once you have the results, you would then need to do a GetItem or BatchGetItem on the table to pull the full records (if needed).
References:
What is the difference between scan and query in dynamodb? When use scan / query?

To get more than one item back from a Query, you need a composite (hash key + sort key) primary key.
If you had "data" as your hash (partition) key, and rec1, rec2, rec3 as the sort key, then you could query just with "data".
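A minimal sketch of that redesign, written in Python for illustration rather than with the .NET SDK used in the question. The attribute name pk, the table name MyTable, and the string key type are assumptions, not taken from the question. The dictionary mirrors the parameters of the low-level Query API and makes no network call.

```python
def split_key(combined, sep="#"):
    """Split an existing key such as 'data#rec1' into (partition, sort)."""
    partition, _, sort = combined.partition(sep)
    return partition, sort


def build_query_params(table_name, partition_value):
    """Build low-level Query parameters that match every item in one partition."""
    return {
        "TableName": table_name,  # hypothetical table name
        # Query only accepts an equality test on the partition key.
        "KeyConditionExpression": "#pk = :pk",
        "ExpressionAttributeNames": {"#pk": "pk"},  # assumed attribute name
        "ExpressionAttributeValues": {":pk": {"S": partition_value}},
    }


# The old combined keys become (partition key, sort key) pairs.
keys = [split_key(k) for k in ["data#rec1", "data#rec2", "data#rec3"]]
params = build_query_params("MyTable", "data")
```

With boto3, these parameters could be passed as client.query(**params); every item whose partition key equals "data" would then come back ordered by its sort key.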

Related

Compare values in dynamodb during query without knowing values in ExpressionAttributeValues

Is it possible to apply a filter based on values inside a dynamodb database?
Let's say the database contains an object info within a table:
info: {
toDo: x,
done: y,
}
Using the ExpressionAttributeValues, is it possible to check whether the info.toDo = info.done and apply a filter on it without knowing the current values of info.toDo and info.done ?
At the moment I tried using ExpressionAttributeNames so it contains:
'#toDo': info.toDo, '#done': info.done'
and the filter FilterExpression is
#toDo = #done
but I'm retrieving no items doing a query with this filter.
Thanks a lot!
DynamoDB is not designed to perform arbitrary queries as you might be used to in a relational database. It is designed for fast lookups based on keys.
Therefore, if you can add an index that lets you reach the records you are looking for, you can use it for this new access pattern. For example, you could add an index that uses info.toDo as the partition key and info.done as the sort key (index keys must be top-level attributes, so the two values would have to be copied out of the info map). You can then query the index with the key condition PK = x and SK = x, assuming that the list of possible values is limited and known.
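Separately from the index approach, the attempt in the question likely fails because ExpressionAttributeNames maps each placeholder to a single attribute name, not to a dotted path or a value. The sketch below shows one way the filter could be written so that both sides of the comparison are attribute paths. It is Python that only builds the request parameters; the table name, the partition key name, and its string type are assumptions.

```python
def build_equal_fields_filter(table_name, partition_key_name, partition_value):
    """Build Query parameters that keep items where info.toDo equals info.done."""
    return {
        "TableName": table_name,
        "KeyConditionExpression": "#pk = :pk",
        # Both operands are attribute paths, so no values for toDo/done are needed.
        "FilterExpression": "#info.#toDo = #info.#done",
        "ExpressionAttributeNames": {
            "#pk": partition_key_name,
            # One placeholder per path segment; the dots stay in the expression.
            "#info": "info",
            "#toDo": "toDo",
            "#done": "done",
        },
        "ExpressionAttributeValues": {":pk": {"S": partition_value}},
    }


# "Tasks", "userId" and "user-1" are hypothetical example values.
params = build_equal_fields_filter("Tasks", "userId", "user-1")
```

Note that a filter expression is applied after the items have been read, so it reduces the size of the response but not the read capacity consumed.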

How to sort DynamoDB table by a single column?

I'd like to list records from my DDB table ordered by creation date.
My table has an attribute DateCreated.
All examples I can find describe ordering within some partition.
But I want global ordering.
Am I supposed to create an artificial attribute which will have the same value across all records, just to use it as a partition key? E.g. add new attribute GlobalPartition with value 1 to every record in the table, and create a GSI with partition key GlobalPartition and sort key DateCreated. Isn't there a better way?
Thx!
As you noticed, DynamoDB indeed does not have an option to sort items "globally". In other words, there is no way to Scan the database in sorted partition-key order. You can only sort items inside one partition, sorted by the "sort key".
When you have a small amount of data, you can indeed do what you said: have a single partition with everything in it. However, it's not clear how practical this approach remains as your single partition grows to gigabytes or terabytes, or how well DynamoDB can load-balance when you have just a single partition (I never saw any DynamoDB documentation that answers this question).
So another option is not to have a single partition but rather a number of them. For example, consider that you want to sort items by date. Instead of having a single partition, have a partition per month, i.e., the partition key is the month number. Now, if you want to sort everything within a month, you can do it directly, but if you want a sorted list for a full year, you need to Query twelve partitions, in order, getting a sorted list from each one and combining them into a sorted list for the full year. So-called time-series databases are often modeled this way.
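The month-per-partition idea can be sketched in plain Python. Here the table is modeled as an in-memory dictionary and query_partition is a hypothetical stand-in for a real Query call; the "YYYY-MM" key format and the ISO date strings used as sort keys are assumptions chosen so that string order matches date order.

```python
from datetime import date


def month_key(d):
    """Partition key for a date, e.g. date(2023, 3, 14) -> '2023-03'."""
    return d.strftime("%Y-%m")


def query_partition(table, key):
    """Stand-in for a Query: items of one partition, ordered by sort key."""
    return sorted(table.get(key, []))


def sorted_year(table, year):
    """Query the twelve monthly partitions in order and concatenate the results."""
    result = []
    for month in range(1, 13):
        result.extend(query_partition(table, f"{year}-{month:02d}"))
    return result


# Build a tiny in-memory "table": partition key -> list of sort keys.
table = {}
for d in [date(2023, 3, 14), date(2023, 1, 9), date(2023, 3, 2), date(2023, 12, 25)]:
    table.setdefault(month_key(d), []).append(d.isoformat())

ordered = sorted_year(table, 2023)
```

Because the partitions are visited in chronological order and each one is already sorted, simple concatenation yields a globally sorted list without a final sort step.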
To sort data in DynamoDB, the attribute you sort on must be the sort key of the table or of an index. If the attribute is not the table's sort key, or the table has no sort key, create a GSI whose sort key is that attribute; an LSI works too. Any attribute that is the sort key of the table, an LSI, or a GSI can be used for ordering.
For more details, see the "ScanIndexForward" parameter of the Query request.
If ScanIndexForward is true, DynamoDB returns the results in the order in which they are stored (by sort key value). This is the default behavior. If ScanIndexForward is false, DynamoDB reads the results in reverse order by sort key value, and then returns the results to the client.
https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_Query.html#API_Query_RequestSyntax
The console UI has a checkbox for this too.
A "global sort" is not possible: reading globally means a Scan, which walks through every item in the table and applies filters but has no sorting option. A Query on an index whose sort key is the attribute in question has the ScanIndexForward option to change the sort direction.
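As an illustration of ScanIndexForward, the sketch below builds the parameters for a newest-first Query against the GSI described in the question (partition key GlobalPartition with the constant value 1, sort key DateCreated). It is Python that only constructs the request; the table name and index name are hypothetical.

```python
def build_newest_first_query(table_name, index_name, partition_value, limit):
    """Build Query parameters returning items in descending sort-key order."""
    return {
        "TableName": table_name,
        "IndexName": index_name,
        "KeyConditionExpression": "#pk = :pk",
        "ExpressionAttributeNames": {"#pk": "GlobalPartition"},
        # Numbers are sent as strings in the low-level attribute-value format.
        "ExpressionAttributeValues": {":pk": {"N": str(partition_value)}},
        # False reverses the default ascending order of the sort key (DateCreated).
        "ScanIndexForward": False,
        "Limit": limit,
    }


# "Records" and "ByDateCreated" are hypothetical names.
params = build_newest_first_query("Records", "ByDateCreated", 1, 20)
```

With boto3 this could be sent as client.query(**params); omitting ScanIndexForward, or setting it to True, would return the oldest items first.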

AWS AppSync response sorted result?

I want to sort $ctx.result.items and return the sorted result. I don't want to manually write Velocity Template Language to sort $ctx.result.items in the response mapping template. Is there a better approach to return the sorted result in AWS AppSync?
What type of sorting are you looking to do? If it's ascending/descending using a DynamoDB resolver then you can set that on the ScanIndexForward argument for this on the request template: https://docs.aws.amazon.com/appsync/latest/devguide/resolver-mapping-template-reference-dynamodb.html
(If you have already found a solution, I hope this will help someone else.)
It depends on how you designed GSI or LSI to your DynamoDB table.
As stated here "DynamoDB builds an unordered hash index on the hash primary key attribute, and a sorted range index on the range primary key attribute."
Here hash index is same as partition key, and range index is same as sort key (old and new terms).
Similar text is stated here - "All items with the same partition key value are stored together, in sorted order by sort key value."
So if you added a GSI or LSI to your DynamoDB table in a way stated above (e.g. all your Products IDs are hash / partition keys and creation times are range / sort keys and you need to sort Products by creation time) you can use something similar to example defined in this page of StackOverflow.

AWS DynamoDB. Querying all hashes IN array

I know BatchGetItem allows retrieval of multiple hash keys. To save on read capacity, I'd like to know whether Query provides the same functionality via some "IN" keyword, i.e., all primary keys would be put into an array and Query would search "IN" that array.
Query doesn't provide what you want. As per the documentation here:
KeyConditionExpression: The condition must perform an equality test on a single partition key value. The condition can also perform one of several comparison tests on a single sort key value. Query can use KeyConditionExpression to retrieve one item with a given partition key value and sort key value, or several items that have the same partition key value but different sort key values.
BatchGetItem is the only option that you have.

Dynamo DB batch operations on single table

I've been going through AWS DynamoDB docs and cannot figure out what's the difference between batchGetItem() and Query().
My use case: I have a table which has Id as primary hash key, and attribute values are Name and Marks.
I would like to perform batch query which returns list of names and marks by providing list of Id's which are primary keys.
Should I use batchGetItem() or Query()?
BatchGetItem: Allows you to parallelize "GetItem" requests for languages that don't support parallelism (e.g. JavaScript). This includes retrieving items from different tables (it doesn't support indexes, though).
Query: Allows you to page through tables with a Hash-Range schema (where you'll have multiple results associated with a Hash key) and allows you to retrieve items from the indexes on your table. Note you can also add an additional condition on range key in your KeyConditions and add conditions on any non primary key attribute in your QueryFilter.
It seems that your use case calls for a BatchGetItem request, as you are trying to retrieve items from your base table by way of a hash key.
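A rough sketch of how such a request could be assembled for the table in the question (hash key Id, attributes Name and Marks). It is Python that only builds the request bodies. The table name and the string type of Id are assumptions. BatchGetItem accepts at most 100 keys per call and rejects duplicate keys, so the list is de-duplicated and split into chunks.

```python
MAX_KEYS_PER_BATCH = 100  # BatchGetItem accepts at most 100 keys per request


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_batch_get_requests(table_name, ids):
    """Build one BatchGetItem request body per chunk of up to 100 unique ids."""
    unique_ids = list(dict.fromkeys(ids))  # drop duplicates, keep first-seen order
    requests = []
    for chunk in chunked(unique_ids, MAX_KEYS_PER_BATCH):
        requests.append({
            "RequestItems": {
                table_name: {
                    "Keys": [{"Id": {"S": item_id}} for item_id in chunk],
                    # Placeholders are used because NAME is a reserved word.
                    "ProjectionExpression": "#n, #m",
                    "ExpressionAttributeNames": {"#n": "Name", "#m": "Marks"},
                }
            }
        })
    return requests


# "Students" is a hypothetical table name; "0" appears twice on purpose.
requests = build_batch_get_requests("Students", [str(i) for i in range(250)] + ["0"])
```

Each body could be sent with boto3 as client.batch_get_item(**request). A real caller would also need to resend any UnprocessedKeys returned in the response, which this sketch leaves out.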
Hope that helps!