DynamoDB batch operations on single table - amazon-web-services

I've been going through the AWS DynamoDB docs and cannot figure out the difference between batchGetItem() and Query().
My use case: I have a table with Id as the primary hash key, and Name and Marks as attributes.
I would like to perform a batch query that returns a list of names and marks for a given list of Ids (the primary keys).
Should I use batchGetItem() or Query()?

BatchGetItem: Allows you to parallelize "GetItem" requests for languages that don't support parallelism (e.g. JavaScript). This includes retrieving items from different tables (it doesn't support indexes, though).
Query: Allows you to page through tables with a Hash-Range schema (where multiple results are associated with a Hash key) and to retrieve items from the indexes on your table. Note that you can also add a condition on the range key in your KeyConditions and conditions on any non-primary-key attribute in your QueryFilter.
It seems that your use case calls for a BatchGetItem request, as you are trying to retrieve items from your base table by Hash key.
Hope that helps!
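For illustration, here is a minimal Python sketch of the BatchGetItem approach. It assumes a boto3-style low-level client is passed in as `client`, that Id is a string attribute, and that the table name is supplied by the caller; the function name is made up for this example. BatchGetItem accepts at most 100 keys per request and may return some of them under UnprocessedKeys, so the sketch chunks the Ids and re-requests whatever was left over.

```python
from typing import Any, Dict, Iterable, List

MAX_BATCH_KEYS = 100  # BatchGetItem accepts at most 100 keys per request


def batch_get_names_and_marks(client: Any, table: str, ids: Iterable[str]) -> List[Dict[str, Any]]:
    """Fetch Id, Name and Marks for every id, at most 100 keys per call.

    Assumption: `client` behaves like boto3.client("dynamodb") and Id is
    stored as a string ("S") attribute.
    """
    # A request that contains the same key twice is rejected, so drop duplicates.
    unique_ids = list(dict.fromkeys(ids))
    items: List[Dict[str, Any]] = []
    for start in range(0, len(unique_ids), MAX_BATCH_KEYS):
        chunk = unique_ids[start:start + MAX_BATCH_KEYS]
        request = {
            table: {
                "Keys": [{"Id": {"S": item_id}} for item_id in chunk],
                # Name is a reserved word, so attributes are referenced via placeholders.
                "ProjectionExpression": "#id, #n, #m",
                "ExpressionAttributeNames": {"#id": "Id", "#n": "Name", "#m": "Marks"},
            }
        }
        # UnprocessedKeys has the same shape as RequestItems, so it can be resent as-is.
        while request:
            response = client.batch_get_item(RequestItems=request)
            items.extend(response.get("Responses", {}).get(table, []))
            request = response.get("UnprocessedKeys") or {}
    return items
```

In real use you would probably add a short backoff before resending unprocessed keys. Note also that the items come back in no particular order.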

Related

Compare values in dynamodb during query without knowing values in ExpressionAttributeValues

Is it possible to apply a filter based on values inside a dynamodb database?
Let's say the database contains an object info within a table:
info: {
toDo: x,
done: y,
}
Using the ExpressionAttributeValues, is it possible to check whether info.toDo = info.done and apply a filter on it without knowing the current values of info.toDo and info.done?
At the moment I tried using ExpressionAttributeNames so that it contains:
'#toDo': 'info.toDo', '#done': 'info.done'
and the FilterExpression is
#toDo = #done
but I get no items back when I run a query with this filter.
Thanks a lot!
DynamoDB is not designed to perform arbitrary queries as you might be used to in a relational database. It is designed for fast lookups based on keys.
Therefore, if you can add an index that gives you access to the records you are looking for, you can use it for this new access pattern. For example, you could add an index that uses info.toDo as the partition key and info.done as the sort key (index key attributes must be top-level scalar attributes, so both values would first have to be copied out of the info map). You can then query the index with the key condition PK=x and SK=x, assuming that the list of possible values is limited and known.
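As a side note to the filter attempt in the question: a placeholder in ExpressionAttributeNames stands for a single attribute name, so a value like 'info.toDo' is treated as one attribute whose name contains a dot, not as a path into the info map. A filter expression can compare two attribute paths directly if each path element gets its own placeholder. The sketch below only builds the Query parameters; the table name and partition key name are hypothetical, and the key value is assumed to be a string.

```python
from typing import Any, Dict


def build_todo_equals_done_query(table: str, pk_name: str, pk_value: str) -> Dict[str, Any]:
    """Build Query parameters that keep only items where info.toDo equals info.done.

    Assumption: `pk_name` is the table's partition key and holds a string.
    """
    return {
        "TableName": table,
        "KeyConditionExpression": "#pk = :pk",
        # Both sides of the comparison are attribute paths, so no values
        # for toDo or done have to be supplied.
        "FilterExpression": "#info.#toDo = #info.#done",
        "ExpressionAttributeNames": {
            "#pk": pk_name,
            "#info": "info",
            "#toDo": "toDo",
            "#done": "done",
        },
        "ExpressionAttributeValues": {":pk": {"S": pk_value}},
    }


params = build_todo_equals_done_query("Tasks", "userId", "user-1")
# With boto3 this could be sent as: boto3.client("dynamodb").query(**params)
```

Keep in mind that the filter is applied after the items matching the key condition have been read, so it reduces the result size but not the read capacity consumed.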

What should be DynamoDB key schema for time sliding window data get scenario?

What I never understood about DynamoDB is how to design a table so that you can efficiently get all data with one particular field lying in some range. For example, a time range: we would like to get data created from timestamp1 up to timestamp2. According to the key design, only the sort key can be used for such a purpose. However, that automatically means the partition key has to be the same for all data, which according to the documentation is an anti-pattern of DynamoDB usage. How should this situation be handled? Would it be a better solution to create an evenly distributed primary key and then a secondary key whose partition part is the same for all items but whose sort part is different for each of them?
You can use a Global Secondary Index, which the documentation describes as follows:
A global secondary index contains a selection of attributes from the base table, but they are organized by a primary key that is different from that of the table.
So you can query on other attributes that are unique.
To clarify what I mean: you can choose something else that can be unique as the primary key, and use a repetitive ID as the GSI key on which you base your query.
NOTE: One of the most common applications of NoSQL DBs is storing time series, where you cannot expect a unique identifier as the PK unless you include the timestamp.
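A rough sketch of what a time-window query against such a GSI could look like, assuming the index uses a repeated grouping attribute as its partition key and a numeric creation timestamp as its sort key. The attribute names groupId and createdAt, like the function name, are invented for this example. The function only builds the parameters for a Query call.

```python
from typing import Any, Dict


def build_time_window_query(table: str, index: str, group_id: str,
                            start_ts: int, end_ts: int) -> Dict[str, Any]:
    """Build Query parameters for items created between start_ts and end_ts.

    Assumptions: the GSI has partition key "groupId" (string) and
    sort key "createdAt" (number, e.g. epoch seconds).
    """
    if start_ts > end_ts:
        raise ValueError("start_ts must not be greater than end_ts")
    return {
        "TableName": table,
        "IndexName": index,
        # A key condition needs equality on the partition key;
        # BETWEEN on the sort key is inclusive on both ends.
        "KeyConditionExpression": "#g = :g AND #ts BETWEEN :start AND :end",
        "ExpressionAttributeNames": {"#g": "groupId", "#ts": "createdAt"},
        "ExpressionAttributeValues": {
            ":g": {"S": group_id},
            # Numbers travel as strings in the low-level API.
            ":start": {"N": str(start_ts)},
            ":end": {"N": str(end_ts)},
        },
    }
```

The partition key still has to be given as an exact value, so how the data is spread over groupId values decides how evenly the load is distributed.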

Query hash/range key and local secondary index

Is it possible to Query a DynamoDB table using both the hash & range key AND a local secondary index?
I have three attributes I want to compare against in my query. Two are the main hash and range keys and the third is the range key of the local secondary index.
No, but that shouldn't be necessary based on your description of what you are trying to accomplish.
If you are trying to access an object based on the hash and range key (of the main table) as well as an additional attribute, selecting on only the hash and range key of the main table will already return that record, since that pair identifies a single record by definition.
If your concern is that the third attribute may hold a value for which you want to ignore the entire record, you can use a query filter to have DynamoDB filter that item out, or you can ignore the object with logic in your application.
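A minimal sketch of the query-filter variant, assuming all three attributes hold strings; the attribute names are passed in by the caller and the function name is invented for this example. The key condition pins down the single record via the table's hash and range key, and the filter drops it if the third attribute does not match.

```python
from typing import Any, Dict, Tuple


def build_key_plus_filter_query(table: str,
                                hash_key: Tuple[str, str],
                                range_key: Tuple[str, str],
                                third: Tuple[str, str]) -> Dict[str, Any]:
    """Build Query parameters from (attribute name, string value) pairs.

    Assumption: the query runs against the base table, where the third
    attribute is a non-key attribute and may therefore appear in the filter.
    """
    return {
        "TableName": table,
        "KeyConditionExpression": "#h = :h AND #r = :r",
        "FilterExpression": "#t = :t",
        "ExpressionAttributeNames": {
            "#h": hash_key[0],
            "#r": range_key[0],
            "#t": third[0],
        },
        "ExpressionAttributeValues": {
            ":h": {"S": hash_key[1]},
            ":r": {"S": range_key[1]},
            ":t": {"S": third[1]},
        },
    }
```

The result then contains either the one matching item or nothing.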

How to perform a range query over AWS dynamoDB

I have an AWS DynamoDB table storing book information; the hash key is the book id. There is an attribute for the book price.
Now I want to perform a query that returns all the books whose price is lower than a certain value. How can I do this efficiently, without scanning the whole table?
A query on a secondary index seems to return only the set of entries for which the index has a certain value, so I am confused about how to perform a range query efficiently. Thank you very much!
You may be confusing two things: the range key and a range condition on an attribute.
To clarify, in this case you would need a secondary index, and when querying the index you would specify a key condition (assuming Java and a secondary index on the value; this works in pretty much any SDK-supported language).
See http://docs.amazonaws.cn/en_us/AWSJavaSDK/latest/javadoc/index.html?com/amazonaws/services/dynamodbv2/model/QueryRequest.html for QueryRequest, with a BETWEEN condition.
You can't do a query of that kind. DynamoDB is sharded across many nodes by hash key, so a query without a hash key (that is, across all hash keys) is essentially a full scan.
A hack for your case would be to have a hash key with only one value for the whole table, but this is fundamentally wrong because you lose all the advantages of using DynamoDB. See the hot hash key issue for more info: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GuidelinesForTables.html
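To make both answers concrete: a price condition can only be part of a key condition if price is the sort key of an index and the index's partition key is given as an exact value. The sketch below assumes a hypothetical secondary index whose partition key is a category attribute and whose sort key is price; neither the category attribute nor the function name appears in the question.

```python
from decimal import Decimal
from typing import Any, Dict


def build_cheaper_than_query(table: str, index: str, category: str,
                             max_price: Decimal) -> Dict[str, Any]:
    """Build Query parameters for books in one category priced below max_price.

    Assumption: the index has partition key "category" (string) and
    sort key "price" (number).
    """
    return {
        "TableName": table,
        "IndexName": index,
        # Equality on the partition key is mandatory; the range condition
        # is only allowed on the sort key.
        "KeyConditionExpression": "#c = :c AND #p < :p",
        "ExpressionAttributeNames": {"#c": "category", "#p": "price"},
        "ExpressionAttributeValues": {
            ":c": {"S": category},
            ":p": {"N": str(max_price)},  # numbers are sent as strings
        },
    }
```

Finding cheap books across all categories would still take one query per category value or a scan, which is the limitation described above.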

SCAN on key attribute in DynamoDB

I need to execute an IN query on the key attribute. Since Query doesn't provide an IN condition, I am planning to use Scan. Will a Scan on the key attribute scan the entire table?
Will SCAN on key attribute scan the entire table?
Yes, see Query and Scan in Amazon DynamoDB:
Scan
A scan operation scans the entire table. You can specify filters to
apply to the results to refine the values returned to you, after the
complete scan. Amazon DynamoDB puts a 1MB limit on the scan (the limit
applies before the results are filtered). A scan can result in no
table data meeting the filter criteria.
Specifically, there is no difference between key and non key attributes as far as the Scan API is concerned, i.e. you simply provide the desired attributes by name, regardless of them being used as an attribute constituting the Primary Key as well or not:
AttributesToGet
Array of Attribute names. If attribute names are not specified then
all attributes will be returned. If some attributes are not found,
they will not appear in the result.
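For reference, a sketch of what such a Scan with an IN filter could look like, assuming a boto3-style low-level client is passed in as `client` and the key attribute holds strings; the function name is invented for this example. Every page still reads the table data before filtering, which is why the loop has to follow LastEvaluatedKey until it is absent, even when a page comes back empty.

```python
from typing import Any, Dict, Iterable, List


def scan_where_key_in(client: Any, table: str, key_name: str,
                      wanted: Iterable[str]) -> List[Dict[str, Any]]:
    """Scan the whole table and keep items whose key is one of `wanted`.

    Assumption: `client` behaves like boto3.client("dynamodb") and the
    key attribute is a string.
    """
    values = list(wanted)
    if not 1 <= len(values) <= 100:
        raise ValueError("IN accepts between 1 and 100 values")
    placeholders = {f":v{i}": {"S": v} for i, v in enumerate(values)}
    params: Dict[str, Any] = {
        "TableName": table,
        "FilterExpression": f"#k IN ({', '.join(placeholders)})",
        "ExpressionAttributeNames": {"#k": key_name},
        "ExpressionAttributeValues": placeholders,
    }
    items: List[Dict[str, Any]] = []
    while True:
        page = client.scan(**params)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return items
        # Continue the scan where the previous page stopped.
        params["ExclusiveStartKey"] = last_key
```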
Wouldn't batchGetItem work for you?