What is the sense of Query in AWS DynamoDB - amazon-web-services

In AWS DynamoDB, a Query must specify a partition key, which makes it work just like GetItem ... because the partition key is unique, it is supposed to return only one item. So if I already know the ID of that item, it makes no sense to query, because a query is meant for constraints ...
So can someone give me an example where querying one partition key returns multiple items?
# Create single-attribute primary key table
aws dynamodb create-table --table-name testdb6 --attribute-definitions '[{"AttributeName": "Id", "AttributeType": "S"}]' --key-schema '[{"AttributeName": "Id", "KeyType": "HASH"}]' --provisioned-throughput '{"ReadCapacityUnits": 5, "WriteCapacityUnits": 5}'
# Populate table
aws dynamodb put-item --table-name testdb6 --item '{ "Id": {"S": "1"}, "LastName": {"S": "Lopez"}, "FirstName": {"S": "Maria"}}'
aws dynamodb put-item --table-name testdb6 --item '{ "Id": {"S": "2"}, "LastName": {"S": "Fernandez"}, "FirstName": {"S": "Augusto"}}'
# Query table using only partition attribute
aws dynamodb query --table-name testdb6 --select ALL_ATTRIBUTES --key-conditions '{"Id": {"AttributeValueList": [{"S": "1"}], "ComparisonOperator": "EQ"}}'
Also, you can only use the EQ operator on the partition key, so BETWEEN, OR, or IN, for example, are not allowed on it.
The alternative to Query is Scan, but:
Scan is expensive (slow)
you can't sort with Scan
Update
So I realized I can use a sort key, and in that case the partition key can act as my table name, so I need to change my vocabulary:
Table -> Database
Partition Key -> Table / Collection
Sort Key -> Primary Key / ObjectId
Example : table my-api with partition key -> className and sort key -> id
my-api
className | id | username | title
_User | 0 | "bingo" |
_User | 1 | "mimi" |
_Song | 0 | | "You with me"
It is a weird design.

If the table has both a partition key and a sort key, querying by partition key alone can return multiple items.
Partition key - required when using the Query API
Sort key - optional when using the Query API
Get API:
For a composite primary key, you must provide values for both the
partition key and the sort key.
So the Get API always returns at most one item. Also, it has no filter expression for filtering by non-key attributes.
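The difference between the two calls can be sketched with a toy in-memory model. This is only an illustration of the semantics, not the DynamoDB API: the class `CompositeKeyTable` and its method names are hypothetical, and the sample rows are taken from the `my-api` example in the question.

```python
from collections import defaultdict


class CompositeKeyTable:
    """Toy in-memory stand-in for a table with a partition key and a sort key."""

    def __init__(self):
        # partition key value -> {sort key value -> item}
        self._partitions = defaultdict(dict)

    def put_item(self, pk, sk, **attrs):
        # (pk, sk) together identify an item; writing the same pair overwrites it.
        self._partitions[pk][sk] = {"pk": pk, "sk": sk, **attrs}

    def get_item(self, pk, sk):
        # Like GetItem: needs the full primary key, returns at most one item.
        return self._partitions.get(pk, {}).get(sk)

    def query(self, pk, sk_condition=None):
        # Like Query: partition key equality is required, the sort key
        # condition is optional, and results are ordered by sort key.
        items = self._partitions.get(pk, {})
        return [items[sk] for sk in sorted(items)
                if sk_condition is None or sk_condition(sk)]


table = CompositeKeyTable()
table.put_item("_User", 0, username="bingo")
table.put_item("_User", 1, username="mimi")
table.put_item("_Song", 0, title="You with me")

users = table.query("_User")                        # two items share one partition key
later_users = table.query("_User", lambda sk: sk > 0)
one_user = table.get_item("_User", 1)               # exactly one item or None
```

Here `query("_User")` returns both user items, while `get_item` needs the sort key as well and can never return more than one.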

Querying is for finding items based on primary key or index key attributes. A primary key has one partition key (hash key) and one optional range key. If a range key is present, you can have multiple records with the same partition key. So to perform a GetItem operation on this kind of composite primary key, you need to specify the range key along with the hash key.
Also, you can specify multiple Global Secondary Indexes (GSI) and Local Secondary Indexes (LSI), through which you can query non-key attributes.
So the Query operation lets you find items based on the hash key and range key attributes of the primary key, an LSI, or a GSI.
Now, the example you used, is certainly not a good approach to design your schema.
Table != Database
Table == Table, and DynamoDB houses all the tables.
User, Song, etc. need to be stored in different tables. Then you can specify id and username to further uniquely identify the items of each table. Each item can be thought of as a record in an RDBMS. Read more about choosing primary keys and indexes in the AWS DynamoDB Developer Guide.

A partition key always returns a unique result if no sort key is defined.
Configuring a sort key on your table means that the combination of partition key and sort key must be unique.
For best practices on defining the structure of your table, please refer to this

Related

Dynamodb - how to pick appropriate GSI and LSI

There are 5 columns in my Table "Banners".
id(string) | createdAt(Date) | caption(string) | isActive(binary) | order(Int)
For now, id is the partition key and primary key.
In the future, I might want to do something like getting all banners with isActive =1 and sorted by order.
As far as I understand, a GSI is an alternative partition key, while an LSI is like a second sort key that keeps the table's partition key unchanged.
Should isActive be GSI and order be LSI?
Here is my rule of thumb when it comes to the LSI: only use it when you
need strong read consistency in the secondary index
want to save provisioned RCU and WCU of the secondary index
Otherwise, use the GSI without any hesitation.
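As a rough illustration of what the question is after, the sketch below emulates in plain Python what reading from an index keyed on `isActive` (partition key) and `order` (sort key) would return. That key choice is an assumption made for the example, not something the answer prescribes, and the function name and sample banners are hypothetical.

```python
def active_banners_index(banners):
    """Emulate reading isActive = 1 from an index keyed on (isActive, order)."""
    # A secondary index only contains items that carry the index key attributes.
    indexed = [b for b in banners if "isActive" in b and "order" in b]
    matching = [b for b in indexed if b["isActive"] == 1]
    # Query results come back sorted by the index sort key.
    return sorted(matching, key=lambda b: b["order"])


banners = [
    {"id": "a", "isActive": 1, "order": 2},
    {"id": "b", "isActive": 0, "order": 1},
    {"id": "c", "isActive": 1, "order": 1},
    {"id": "d", "order": 3},  # no isActive attribute, so absent from the index
]
active = active_banners_index(banners)
```

With this layout a single query on the index covers both the filter on `isActive` and the ordering by `order`.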

How should the Query's format be structured for sending a call with 'Greater than' condition in AWS DynamoDB?

I wanted to run a greater-than query against the partition key of my table. Later I learned that greater-than queries can only be executed on sort keys, not on partition keys. So I have now redesigned my table, and here's a screenshot of the new one: (StoreID is the partition key, and OrderID is the sort key)
How should I format the Query, if I want to run a query like return those items whose 'OrderID' > 1005?
More particularly, what should I mention in the Query condition to meet my requirements?
Thanks a lot!
You can use the following CLI command to run query "return those items in store with storeid='STR100' whose 'OrderID' > 1005".
aws dynamodb query --table-name <table-name> --key-condition-expression "StoreID = :v1 AND OrderID > :v2" --expression-attribute-values '{":v1": {"S": "STR100"}, ":v2": {"N": "1005"}}'
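The same request can be sketched in Python as a plain dictionary of keyword arguments. The helper name `build_orders_query` and the table name `Orders` are hypothetical, and the sketch assumes, as the CLI command does, that StoreID is a String and OrderID is a Number.

```python
def build_orders_query(table_name, store_id, min_order_id):
    """Build keyword arguments for a low-level DynamoDB Query call."""
    return {
        "TableName": table_name,
        # Equality on the partition key, a range condition on the sort key.
        "KeyConditionExpression": "StoreID = :v1 AND OrderID > :v2",
        "ExpressionAttributeValues": {
            ":v1": {"S": store_id},
            # Number values travel as strings in the low-level format.
            ":v2": {"N": str(min_order_id)},
        },
    }


params = build_orders_query("Orders", "STR100", 1005)
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("dynamodb").query(**params)
```

Note that the number is passed as the string "1005"; the low-level format does not accept a bare JSON number for an `N` value.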

Filtering by non-key field in DynamoDB (aws-cli)

I'm trying to query a number field in DynamoDB through aws-cli.
It forces me to set the key (userId) to some value, although I want to retrieve all the users where queriedField equals 0. This is the syntax:
aws dynamodb query \
--table-name TableName \
--key-condition-expression "userId = :userid" \
--filter-expression "mapAttr.queriedField = :num" \
--expression-attribute-values '{ ":userid": { "S": "<AccountID>" }, ":num" : { "N": "0" }}'
In order to do this query, you will have to scan the whole table with your filter expression.
However, if this is for something that is still in development/design, consider making the 'number field' a top level attribute. That will allow you to create a GSI with a hash key of 'number field', and project the userId attribute to the GSI. Alternatively, you can use Global Secondary Index Write Sharding for Selective Table Queries
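A filtered scan still reads the whole table page by page, so the caller has to follow `LastEvaluatedKey` until it is absent. The sketch below shows that loop in Python. `scan_all` is a hypothetical helper that accepts any callable shaped like boto3's `client.scan`, which lets it be exercised without AWS access; the table and attribute names are the ones from the question.

```python
def scan_all(scan_page, **params):
    """Collect items from every page of a Scan.

    scan_page is any callable with the shape of boto3's client.scan:
    it takes keyword arguments and returns a dict with "Items" and,
    when more pages remain, "LastEvaluatedKey".
    """
    items = []
    while True:
        page = scan_page(**params)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return items
        # Resume the scan where the previous page stopped.
        params["ExclusiveStartKey"] = last_key


filter_params = {
    "TableName": "TableName",
    "FilterExpression": "mapAttr.queriedField = :num",
    "ExpressionAttributeValues": {":num": {"N": "0"}},
}
# With boto3 installed and credentials configured, this could be run as:
#   users = scan_all(boto3.client("dynamodb").scan, **filter_params)
```

The filter is applied after each page is read, so it reduces what is returned but not what is scanned.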

Secondary indexes for Dynamodb flexibility

Coming from a SQL background, trying to understand NoSQL, particularly DynamoDB options. Given this schema:
{
  "publist": [{
      "Author": "John Scalzi",
      "Title": "Old Man's War",
      "Publisher": "Tor Books",
      "Tags": [
        "DeepSpace",
        "SciFi"
      ]
    },
    {
      "Author": "Ursula Le Guin",
      "Title": "Wizard of Earthsea",
      "Publisher": "Mifflin Harcourt",
      "Tags": [
        "MustRead",
        "Fantasy"
      ]
    },
    {
      "Author": "Cory Doctorow",
      "Title": "Little Brother",
      "Publisher": "Doherty"
    }
  ]
}
I could have the main table use Author/Title as hash/range keys. A global secondary index could be Publisher/Title. What are the best practices here? How can I get a list of all Authors for a publisher without a total table scan? Can't have a secondary index because Publisher/Author is not unique! Also, what are my options if I want all the titles that have a tag of DeepSpace?
EDIT: See RPM & Vikdor answers below. GSI need not be unique, so Publisher/Author is possible. But question remains: is there any workaround for getting all authors by tag, without full table scan?
"Cant have a secondary index because Publisher/Author is not unique!"
Sure you can, just make sure your Publisher/Title index has Author as a projection - you can then do a query by publisher and just iterate over the results and collect the authors.
When you set up your indexes, you can choose which attributes are projected into the index. Having a Publisher or Publisher/Title key doesn't mean you can only view the Publisher or the Publisher and Title; it means you can only query by Publisher, or by Publisher and Title. So if you have all attributes, or just the Author attribute, projected into your index, you can get a list of authors by publisher using a query and not a full table scan.
"Cant have a secondary index because Publisher/Author is not unique!"
The (hash primary key, range primary key) tuple need not be unique for defining a Global Secondary Index. This is only a requirement for the Table level key definitions, i.e. the table cannot have multiple rows with the same values of (hash primary key, range primary key) tuple.
"How can I get a list of all Authors for a publisher without a total table scan"
You define a GSI on Publisher (Hash PK), Author (Range PK) and use DynamoDB query on the GSI with the Publisher attribute set as the Hash Key Value.
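A minimal Python sketch of that query follows. The helper names, the table name `Books`, and the index name `PublisherAuthorIndex` are hypothetical. Attribute-name placeholders (`#pub`, `#author`) are used so the expressions do not depend on whether those names clash with reserved words. Because one author can have several titles with the same publisher, the results are de-duplicated on the client.

```python
def build_authors_by_publisher_query(table_name, index_name, publisher):
    """Build keyword arguments for a Query against a Publisher/Author GSI."""
    return {
        "TableName": table_name,
        "IndexName": index_name,
        "KeyConditionExpression": "#pub = :p",
        "ProjectionExpression": "#author",
        "ExpressionAttributeNames": {"#pub": "Publisher", "#author": "Author"},
        "ExpressionAttributeValues": {":p": {"S": publisher}},
    }


def distinct_authors(items):
    """Return author names from Query items, in first-seen order, without repeats."""
    seen = {}
    for item in items:
        seen.setdefault(item["Author"]["S"], None)
    return list(seen)


params = build_authors_by_publisher_query("Books", "PublisherAuthorIndex", "Tor Books")
# With boto3 installed and credentials configured, this could be sent as:
#   items = boto3.client("dynamodb").query(**params)["Items"]
```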
Unlike SQL, where you can create non-clustered indexes on arbitrary columns based on retrieval patterns, DynamoDB limits the number of Local Secondary Indexes and Global Secondary Indexes per table. It is therefore important to list the data retrieval use cases before choosing the Hash Primary Key and Range Primary Key for a table, and to leverage Local Secondary Indexes as much as possible, since they use the table's read and write capacity and support strongly consistent reads (you can also run eventually consistent queries on LSIs to save capacity). GSIs need their own read and write capacity and are eventually consistent.
Unfortunately, this is not currently supported in DynamoDB. DDB does not provide the capability to query on nested documents the way MongoDB does.
In this situation, consider modelling the data differently and putting the nested document in a separate table.
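One way to read that suggestion is to write one item per (tag, title) pair into a second table, so that a tag becomes something you can query by key. The Python sketch below only shows the reshaping step. The function name `tag_items` and the choice of Tag as partition key and Title as sort key are assumptions for illustration; the sample data is the `publist` from the question.

```python
def tag_items(publist):
    """Flatten nested Tags lists into one item per (tag, title) pair."""
    rows = []
    for book in publist:
        # Books without a Tags attribute simply contribute no rows.
        for tag in book.get("Tags", []):
            rows.append({"Tag": tag, "Title": book["Title"], "Author": book["Author"]})
    return rows


publist = [
    {"Author": "John Scalzi", "Title": "Old Man's War",
     "Publisher": "Tor Books", "Tags": ["DeepSpace", "SciFi"]},
    {"Author": "Ursula Le Guin", "Title": "Wizard of Earthsea",
     "Publisher": "Mifflin Harcourt", "Tags": ["MustRead", "Fantasy"]},
    {"Author": "Cory Doctorow", "Title": "Little Brother", "Publisher": "Doherty"},
]
rows = tag_items(publist)
# Stand-in for querying the tag table with Tag = "DeepSpace".
deep_space_titles = [r["Title"] for r in rows if r["Tag"] == "DeepSpace"]
```

The cost of this layout is that every write to a book's tags has to be mirrored into the second table by the application.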
hope this will help.
Cheers,

DynamoDB : List all partition keys

I want to update contents in DynamoDB, for which I need to iterate over the existing partition keys present in the table.
Is there any way to fetch only the list of partition keys using Python? Scan and Query only work on attributes of my table. Is there any way to get all partition keys for a table?
If your table uses sort keys in addition to the partition keys (stated differently, if the keys are a composite of partition + sort key) then the answer is: no, there is no way to query or scan for just the partition keys. To clarify, you can still scan your table with a projection that returns the keys only, but it will return each partition key multiple times, once for each item that shares that partition key with a different sort key.
If your table schema uses partition keys only (no sort key) then you can write a scan with a projection of only the primary key and therefore, get the list of partition keys as a result.
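Since the question asks about Python, here is a small sketch of the client-side step that the composite-key case requires: collapsing the repeated partition key values from a keys-only scan. The function name and the default attribute name `pk` are hypothetical, and the items are assumed to be in the low-level format returned by the API, where each value is a one-entry dictionary such as `{"S": "123"}`.

```python
def distinct_partition_keys(items, pk_name="pk"):
    """Return the distinct partition key values from scanned items, in first-seen order."""
    seen = {}
    for item in items:
        # Each low-level attribute value is a one-entry dict such as {"S": "123"}.
        (value,) = item[pk_name].values()
        seen.setdefault(value, None)
    return list(seen)


scanned = [
    {"pk": {"S": "123"}, "sk": {"S": "abc"}},
    {"pk": {"S": "123"}, "sk": {"S": "abd"}},
    {"pk": {"S": "456"}, "sk": {"S": "def"}},
]
keys = distinct_partition_keys(scanned)
```

The scan still reads every item, so this saves nothing on the DynamoDB side; it only tidies the result.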
Overview
To get all of the partition keys from a table, you need to use Scan, which reads all of the items in the table. Because you only want keys returned, you can use the ProjectionExpression parameter to specify which attributes should be returned.
Scan
The Scan operation returns one or more items and item attributes by accessing every item in a table or a secondary index. To have DynamoDB return fewer items, you can provide a FilterExpression operation.
ProjectionExpression
A string that identifies one or more attributes to retrieve from the specified table or index. These attributes can include scalars, sets, or elements of a JSON document. The attributes in the expression must be separated by commas. If no attribute names are specified, then all attributes will be returned. If any of the requested attributes are not found, they will not appear in the result.
Solution
My Table
pk | sk | col1 | col2 | col3
123 | abc | data | data | data
456 | def | data | data | data
789 | ghi | data | data | data
Scan with ProjectionExpression
aws dynamodb scan \
--table-name MusicCollection \
--projection-expression "pk, sk"
Response
pk | sk
123 | abc
456 | def
789 | ghi