DynamoDB - Global Secondary Index using GetItemRequest using Hash and Range - amazon-web-services

I am trying to use the Java AWS sdk to get a document based on a Global Secondary Index.
Setup as follows:
Hash Key: MyId - Number
Range Key: MyDate - String
Here is my code:
Map<String, AttributeValue> key = new HashMap<String, AttributeValue>();
key.put("MyId", new AttributeValue().withN("1234"));
key.put("MyDate", new AttributeValue().withS("2014-10-12"));
GetItemRequest gi = new GetItemRequest().withTableName(tableName).withKey(key);
GetItemResult result = getDynamoDBClient().getItem(gi);
But this always returns:
The provided key element does not match the schema (Service:
AmazonDynamoDBv2; Status Code: 400
What am I doing wrong?

A few notes. First, you mention a GSI, but your code issues a GetItemRequest, which looks items up by the table's primary key. So perhaps something is missing from your question.
Are the keys you listed the primary key of the table or the GSI definition?
You can only Query (or Scan) a GSI; GetItem always works on the table's primary key.
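To make the last point concrete, here is a minimal sketch of the parameters a Query against the index would carry. It uses plain Java maps in the shape of the low-level Query API (TableName, IndexName, KeyConditionExpression, ExpressionAttributeValues) rather than the SDK classes. The index name MyId-MyDate-index and the class name are assumptions, since the question does not say what the index is called.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: builds Query parameters as plain maps, no AWS SDK required.
final class GsiQueryParams {

    // Hypothetical index name; substitute the name of your own GSI.
    static final String INDEX_NAME = "MyId-MyDate-index";

    static Map<String, Object> build(String tableName, long myId, String myDate) {
        // Typed attribute values in DynamoDB JSON form: N for numbers, S for strings.
        Map<String, Object> values = new LinkedHashMap<>();
        values.put(":id", Map.of("N", Long.toString(myId)));
        values.put(":date", Map.of("S", myDate));

        Map<String, Object> params = new LinkedHashMap<>();
        params.put("TableName", tableName);
        // IndexName is what directs the request at the GSI instead of the base table.
        params.put("IndexName", INDEX_NAME);
        params.put("KeyConditionExpression", "MyId = :id AND MyDate = :date");
        params.put("ExpressionAttributeValues", values);
        return params;
    }
}
```

With the SDK used in the question, these would map onto QueryRequest via withTableName, withIndexName, withKeyConditionExpression and withExpressionAttributeValues. GetItemRequest has no index name parameter, which is why it can only address the table's own key.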

Related

DynamoDb Invalid ConditionExpression due to : present in expression

I want to do a conditional putItem call into DynamoDB, i.e. don't insert an entry in DynamoDB if the primary key (partition key + sort key) already exists. My schema's keys look like this:
PartitionKey: abc:def
SortKey:abc:123
To do a conditional put I do something like this:
private static final String PK_DOES_NOT_EXIST_EXPR = "attribute_not_exists(%s)";
final String condition = String.format(PK_DOES_NOT_EXIST_EXPR,
record.getPKey() + record.getSortKey());
final PutItemEnhancedRequest putItemEnhancedRequest = PutItemEnhancedRequest
.builder(Record.class)
.conditionExpression(
Expression.builder()
.expression(condition)
.build()
)
.item(newRecord)
.build();
However I run into following error
Exception in thread "main" software.amazon.awssdk.services.dynamodb.model.DynamoDbException: Invalid ConditionExpression: Syntax error; token: ":123", near: "abc:123)" (Service: DynamoDb, Status Code: 400
I am assuming this is because of : present in my condition, because the same expression without : in the key succeeds. Is there a way to fix this?
Your condition should include the name of the partition key attribute, not its value. For example:
attribute_not_exists(pk)
Also, see Uniqueness for composite primary keys for an explanation of why you only need to indicate the partition key attribute name, not both the partition key and the sort key attribute names. So, the following, while not harmful, is unnecessary:
attribute_not_exists(pk) AND attribute_not_exists(sk)
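As a minimal sketch of that advice, the helper below builds the condition from the attribute name only, so a key value such as abc:def never ends up inside the expression. The class and method names are hypothetical, and pk is the example attribute name from the answer above, not necessarily the one in your table.

```java
import java.util.Map;

// Sketch only: builds the condition parts as plain values, no AWS SDK required.
final class PutIfAbsentCondition {

    final String expression;
    final Map<String, String> expressionNames;

    private PutIfAbsentCondition(String expression, Map<String, String> expressionNames) {
        this.expression = expression;
        this.expressionNames = expressionNames;
    }

    // partitionKeyAttribute is the attribute NAME (e.g. "pk"), never a value such as "abc:def".
    static PutIfAbsentCondition forPartitionKey(String partitionKeyAttribute) {
        // The #pk placeholder keeps the expression valid even when the real
        // attribute name is a reserved word or contains special characters.
        return new PutIfAbsentCondition(
                "attribute_not_exists(#pk)",
                Map.of("#pk", partitionKeyAttribute));
    }
}
```

With the enhanced client from the question, the two fields could be passed to Expression.builder() through expression(...) and expressionNames(...).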

DynamoDB Query with filter

I would like to write a DynamoDB query that filters on a certain field, which sounds simple.
All the examples I find always include the partition key value, which really confuses me, since it is a unique value, but I want a list.
I have id as the partition key and no sort key or any other index. I tried adding partner as an index, but it did not make any difference.
AttributeValue attribute = AttributeValue.builder()
.s(partner)
.build();
Map<String, AttributeValue> expressionValues = new HashMap<>();
expressionValues.put(":value", attribute);
Expression expression = Expression.builder()
.expression("partner = :value")
.expressionValues(expressionValues)
.build();
QueryConditional queryConditional = QueryConditional
.keyEqualTo(Key.builder()
.partitionValue("id????")
.build());
Iterator<Product> results = productTable.query(r -> r.queryConditional(queryConditional)
.filterExpression(expression))
.items()
.iterator();
Would appreciate any help. Is there a misunderstanding on my side?
DynamoDB has two distinct, but similar, operations - Query and Scan:
Scan is for reading the entire table, including all partition keys.
Query is for reading a specific partition key and all sort keys in it (or a contiguous range of sort keys, hence the nickname "range key" for that key).
If your data model does not have a range key, Query is not relevant for you; you should use Scan.
However, this means that each time you run this Scan, the entire table will be read. Unless your table is tiny, this doesn't make economic sense, and you should reconsider your data model. For example, if you frequently look up results by the "partner" attribute, you can consider creating a GSI (global secondary index) with "partner" as its partition key, allowing you to quickly and cheaply fetch the list of items with a given "partner" value without scanning the entire table.
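The cost difference can be illustrated with a small in-memory model. This is not DynamoDB code: the class, the Product shape and the itemsRead counter are all hypothetical, and the map merely stands in for a GSI whose partition key is partner.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory model only; it imitates the access patterns, not the DynamoDB API.
final class PartnerLookupModel {

    // Hypothetical item shape: id is the table's partition key, partner a plain attribute.
    static final class Product {
        final String id;
        final String partner;

        Product(String id, String partner) {
            this.id = id;
            this.partner = partner;
        }
    }

    private final List<Product> table = new ArrayList<>();
    // Stands in for a GSI whose partition key is "partner".
    private final Map<String, List<Product>> partnerIndex = new HashMap<>();
    // Counts items touched; reads are what you pay for, not matches.
    int itemsRead = 0;

    void put(Product p) {
        table.add(p);
        partnerIndex.computeIfAbsent(p.partner, k -> new ArrayList<>()).add(p);
    }

    // Scan with a filter: every item is read, then non-matching ones are discarded.
    List<Product> scanWithFilter(String partner) {
        List<Product> out = new ArrayList<>();
        for (Product p : table) {
            itemsRead++;
            if (p.partner.equals(partner)) {
                out.add(p);
            }
        }
        return out;
    }

    // Query on the index: only the items stored under that partner are read.
    List<Product> queryIndex(String partner) {
        List<Product> hits = partnerIndex.getOrDefault(partner, List.of());
        itemsRead += hits.size();
        return hits;
    }
}
```

Both methods return the same items, but the scan touches the whole table while the index lookup touches only the matches, which is the reason for preferring a GSI when this lookup is frequent.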

DynamoDB BatchGetItemRequest without providing Primary Key

My AWS DynamoDB table has:
Client (Primary Key),
folder_location (non-key attribute),
script_name (non-key attribute)
I want to retrieve records using Client and folder_location attributes using BatchGetItemRequest.
But I am getting the error below:
Failed to retrieve items.com.amazonaws.services.dynamodbv2.model.AmazonDynamoDBException: The provided key element does not match the schema (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: ValidationException;
Is there a way to do this with BatchGetItemRequest only?
You could instead use a Query or Scan operation and specify a Filter Expression to limit results based on the folder_location.
To use BatchGetItemRequest you must provide the key, per https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_BatchGetItem.html:
The BatchGetItem operation returns the attributes of one or more items
from one or more tables. You identify requested items by primary key.
Keys - An array of primary key attribute values that define specific
items in the table. For each primary key, you must provide all of the
key attributes. For example, with a simple primary key, you only need
to provide the partition key value. For a composite key, you must
provide both the partition key value and the sort key value.
I recommend you double check that the key you are providing matches the key defined in the table (name and data type). If so, then try one of those other options with a FilterExpression.
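The double check suggested above can be sketched as a small helper that compares the key you send with the table's key schema. It only approximates what the service validates, and the names here are hypothetical. Type codes follow the DynamoDB convention (S for string, N for number, B for binary).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch only: approximates the key validation, no AWS SDK required.
final class KeySchemaCheck {

    // keySchema:   key attribute name -> type code ("S", "N" or "B")
    // providedKey: attribute name -> type code of the value sent in the request
    static List<String> problems(Map<String, String> keySchema, Map<String, String> providedKey) {
        List<String> problems = new ArrayList<>();
        // Every key attribute must be present with the right type.
        for (Map.Entry<String, String> e : keySchema.entrySet()) {
            String type = providedKey.get(e.getKey());
            if (type == null) {
                problems.add("missing key attribute: " + e.getKey());
            } else if (!type.equals(e.getValue())) {
                problems.add("wrong type for " + e.getKey()
                        + ": expected " + e.getValue() + ", got " + type);
            }
        }
        // Nothing besides the key attributes may appear in the key.
        for (String name : providedKey.keySet()) {
            if (!keySchema.containsKey(name)) {
                problems.add("not a key attribute: " + name);
            }
        }
        return problems;
    }
}
```

For the table in the question, assuming Client is a string key, a key map containing both Client and folder_location would be reported as "not a key attribute: folder_location". That corresponds to the situation that produces "The provided key element does not match the schema".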

Querying a Global Secondary Index of a DynamoDB table without using the partition key

I have a DynamoDB table with partition key as userID and no sort key.
The table also has a timestamp attribute in each item. I wanted to retrieve all the items having a timestamp in the specified range (regardless of userID i.e. ranging across all partitions).
After reading the docs and searching Stack Overflow (here), I found that I need to create a GSI for my table.
Hence, I created a GSI with the following keys:
Partition Key: userID
Sort Key: timestamp
I am querying the index with Java SDK using the following code:
String lastWeekDateString = getLastWeekDateString();
AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard().build();
DynamoDB dynamoDB = new DynamoDB(client);
Table table = dynamoDB.getTable("user table");
Index index = table.getIndex("userID-timestamp-index");
QuerySpec querySpec = new QuerySpec()
.withKeyConditionExpression("timestamp > :v_timestampLowerBound")
.withValueMap(new ValueMap()
.withString(":v_timestampLowerBound", lastWeekDateString));
ItemCollection<QueryOutcome> items = index.query(querySpec);
Iterator<Item> iter = items.iterator();
while (iter.hasNext()) {
Item item = iter.next();
// extract item attributes here
}
I am getting the following error on executing this code:
Query condition missed key schema element: userID
From what I know, I should be able to query the GSI using only the sort key without giving any condition on the partition key. Please help me understand what is wrong with my implementation. Thanks.
Edit: After reading the thread here, it turns out that we cannot query a GSI with only a range on the sort key. So, what is the alternative, if any, to query the entire table by a range query on an attribute? One suggestion I found in that thread was to use year as the partition key. This will require multiple queries if the desired range spans multiple years. Also, this does not distribute the data uniformly across all partitions, since only the partition corresponding to the current year will be used for insertions for one full year. Please suggest any alternatives.
When using the DynamoDB Query operation, you must specify at least the partition key. This is why you get the error that userID is required. From the AWS Query docs:
The condition must perform an equality test on a single partition key value.
The only way to get items without the partition key is a Scan operation (but the results won't be sorted by your sort key!).
If you want to get all the items sorted, you would have to create a GSI with a partition key that is the same for all the items you need (e.g. create a new attribute on all items, such as "type": "item"). You can then query the GSI and specify #type = :item:
// timestamp is a DynamoDB reserved word, so it is referenced through the #ts placeholder
QuerySpec querySpec = new QuerySpec()
.withKeyConditionExpression("#type = :item AND #ts > :v_timestampLowerBound")
.withNameMap(new NameMap()
.with("#type", "type")
.with("#ts", "timestamp"))
.withValueMap(new ValueMap()
.withString(":v_timestampLowerBound", lastWeekDateString)
.withString(":item", "item"));
A good solution for any custom querying requirement in DynamoDB is to design the right primary key scheme for a GSI.
When designing a DynamoDB primary key, the main principle is that the hash key should partition the items, and the sort key should sort the items within a partition.
Having said that, I recommend using the year of the timestamp as the hash key and the month-date as the sort key.
In this case you need at most 2 queries, provided the range spans no more than two calendar years.
You are right that you should avoid filtering or scanning as much as you can.
For example, if the start date and the end date fall in the same year, you need only one query:
.withKeyConditionExpression("#year = :year and #monthDate > :startMonthDate and #monthDate < :endMonthDate")
and otherwise like this:
.withKeyConditionExpression("#year = :startYear and #monthDate > :startMonthDate")
and
.withKeyConditionExpression("#year = :endYear and #monthDate < :endMonthDate")
(Expression placeholders may not contain hyphens, so names such as #monthDate are used here.)
Finally, you should union the result sets from both queries.
This needs at most two Query calls; the read capacity consumed depends on how much data each query reads.
For better comparison of sort key, you might need to use UNIX timestamp.
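The splitting step described in this answer could be sketched as follows. It assumes the sort key is stored as a zero-padded MM-dd string, so that string order matches date order, and it treats both ends of the range as inclusive (which would map to BETWEEN or >= and <= rather than the strict comparisons shown above). The class and field names are hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Sketch only: plans the per-year queries, no AWS SDK required.
final class YearPartitionedRange {

    // One Query per slice: partition key = year, sort key within [fromMonthDay, toMonthDay].
    static final class YearSlice {
        final int year;
        final String fromMonthDay;
        final String toMonthDay;

        YearSlice(int year, String fromMonthDay, String toMonthDay) {
            this.year = year;
            this.fromMonthDay = fromMonthDay;
            this.toMonthDay = toMonthDay;
        }
    }

    // Zero-padded so that lexicographic order equals chronological order.
    private static final DateTimeFormatter MONTH_DAY = DateTimeFormatter.ofPattern("MM-dd");

    // Splits the inclusive range [start, end] into one slice per calendar year.
    static List<YearSlice> split(LocalDate start, LocalDate end) {
        if (end.isBefore(start)) {
            throw new IllegalArgumentException("end is before start");
        }
        List<YearSlice> slices = new ArrayList<>();
        for (int year = start.getYear(); year <= end.getYear(); year++) {
            String from = (year == start.getYear()) ? start.format(MONTH_DAY) : "01-01";
            String to = (year == end.getYear()) ? end.format(MONTH_DAY) : "12-31";
            slices.add(new YearSlice(year, from, to));
        }
        return slices;
    }
}
```

A one-week range such as 2019-12-28 to 2020-01-03 yields two slices (2019 from 12-28 to 12-31, and 2020 from 01-01 to 01-03), so two queries are issued and their results concatenated in order.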

Dynamo DB : Deleting entries within a table with HashKey + Range

I have DynamoDB table with Hashkey + RangeKey. I am facing the following error while deleting an entry for a given HashKey and RangeKey.
Tried following approaches
Used a DynamoDBMapper to get the record (Obj) from DB using DynamoDBQueryExpression that includes both HashKey + RangeKey in KeyCondition. Performed dynamoDBMapper.delete(Obj).
Referred to other post - DynamoDb: Delete all items having same Hash Key and tried
HashMap<String, AttributeValue> eav = new HashMap<String, AttributeValue>();
eav.put(":v1", new AttributeValue().withS(value));
DynamoDBQueryExpression<DocumentTable> queryExpression = new DynamoDBQueryExpression<DocumentTable>()
.withKeyConditionExpression("documentId = :v1")
.withExpressionAttributeValues(eav);
List<DocumentTable> ddbResults = dynamoDBMapper.query(DocumentTable.class, queryExpression);
dynamoDBMapper.batchDelete(ddbResults);
In both the above cases I see the following exception
com.amazonaws.services.dynamodbv2.model.AmazonDynamoDBException: The provided key element does not match the schema (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: ValidationException; Request ID: I12MUB0FSQNAQT6AH0RHE1B12JVV4KQNSO5AEMVJF66Q9ASUAAJG)
Please help in case someone had similar issues.
Is it not recommended to have a composite Key ( as HashKey and SortKey )?
You tried to specify an item by giving a value for the key documentId. The error is telling you that documentId isn't the name of the key in your table.
Replace documentId with the name of the key in your table.