How to get a certain item with a secondary index - amazon-web-services

I would like some help. I'm trying to get a certain item from a DynamoDB table by calling it with the specific id (primary key) and with a global secondary index called index_book.
The function of interest is as follows:
case "GET /book/{id}":
  body = await dynamo
    .get({
      TableName: "book",
      Key: {
        id: event.pathParameters.id
      }
    })
    .promise();
  break;
When I call the URL with a specific id, for example /book/7 (where 7 is the id),
I get the following error:
"The provided key element does not match the schema"
Can you help me please? I will be very grateful to you.

You cannot do a GetItem against a GSI; you have to do a Query. That's because a GSI has no uniqueness constraint, so multiple items might have the same partition key / sort key combination.
So switch the get to a query, and specify the IndexName as well.
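A minimal sketch of that change, assuming the partition key of index_book is an attribute named `id` (the question does not show the index schema) and that `dynamo` is the same DocumentClient used in the question. The helper name `buildBookQuery` is made up for illustration.

```javascript
// Hypothetical helper: builds Query parameters for the GSI instead of a GetItem call.
// Assumption: the partition key of index_book is an attribute named "id".
function buildBookQuery(id) {
  return {
    TableName: "book",
    IndexName: "index_book",
    KeyConditionExpression: "#id = :id",
    ExpressionAttributeNames: { "#id": "id" },
    ExpressionAttributeValues: { ":id": id },
  };
}

// Inside the handler from the question, the call would then look like:
//   body = await dynamo.query(buildBookQuery(event.pathParameters.id)).promise();
// body.Items is an array, because an index can hold several items with the same key.
const bookParams = buildBookQuery("7");
console.log(JSON.stringify(bookParams, null, 2));
```

Path parameters arrive as strings, so if the key attribute is stored as a Number the value would need converting (for example with `Number(...)`) before it is used in the key condition.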

Related

Dynamodb GSI for boolean value

So I have this notifications table with the following columns:
PK: (which stores the userId)
sentAt: (which stores the date the notification was sent)
data: (which stores the data of the notification)
Read: (a boolean value which tells if the user has read the specific notification)
I wanted to create a GSI to get all the notifications from a specific user that are not read (Read: False).
So the partition key would be userId and the sort key would be Read. The issue is that I cannot use a boolean value as the sort key, so I cannot query for the notifications a user has not read.
This works with scan, but that is not the result I am trying to achieve. Can anyone help me with this? Thanks
const params = {
  TableName: await this.configService.get('NOTIFICATION_TABLE'),
  FilterExpression: '#PK = :PK AND #Read = :Read',
  ExpressionAttributeNames: {
    '#PK': 'PK',
    '#Read': 'Read',
  },
  ExpressionAttributeValues: {
    ':PK': 'NOTIFICATION#a8a8e4c7-cab0-431e-8e08-1bcf962358b8',
    ':Read': true, // this is causing the error
  },
};
const response = await this.dynamoDB.scan(params).promise();
Yes, a Boolean value cannot be used as a DynamoDB partition key or sort key; key attributes must be of type String, Number, or Binary.
Some alternatives you could consider:
Create a GSI with only a partition key, gsi-userId. You can then query by userId and filter by Read. This at least saves some cost, because you no longer scan the whole table. However, be aware of hot partitions.
Change the Read data type to a string instead, for example with the values Y and N only. You can then create a GSI, gsi-userId-Read, with userId as the partition key and Read as the sort key, which would fulfill what you need.
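A rough sketch of the second alternative, assuming the gsi-userId-Read index uses the `PK` attribute from the question as its partition key and `Read` (now stored as the string "Y" or "N") as its sort key. The helper name and the table name passed in are placeholders, not part of the original post.

```javascript
// Hypothetical helper: query unread notifications through a GSI whose
// partition key is PK and whose sort key is Read, stored as "Y" or "N".
function buildUnreadQuery(tableName, userKey) {
  return {
    TableName: tableName,
    IndexName: "gsi-userId-Read",
    // Placeholders are used for both attribute names, as in the question's scan.
    KeyConditionExpression: "#PK = :PK AND #Read = :Read",
    ExpressionAttributeNames: { "#PK": "PK", "#Read": "Read" },
    ExpressionAttributeValues: { ":PK": userKey, ":Read": "N" },
  };
}

const unreadParams = buildUnreadQuery(
  "notifications", // placeholder table name
  "NOTIFICATION#a8a8e4c7-cab0-431e-8e08-1bcf962358b8"
);
// With the DocumentClient from the question, this would be sent as:
//   const response = await this.dynamoDB.query(unreadParams).promise();
console.log(unreadParams.KeyConditionExpression);
```

Because both conditions are key conditions, only the matching items are read, unlike the scan in the question, which reads the whole table before filtering.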

DynamoDB Between query on GSI does not work as expected

It is a jobPosts schema that has a posted_date as one of the attributes. The goal is to query all the job posts between two dates.
Here is the schema for your reference:
{
  'job_id': {S: jobInfo.job_id},
  'company': {S: jobInfo.company},
  'title': {S: jobInfo.title},
  'posted_on': {S: jobInfo.posted_on},
}
posted_on is an ISO string (2019-11-10T10:52:38.013Z). job_id is the primary key (partition key), and since I need to query by date, I created a GSI with posted_on as its partition key. Here is the query:
const params = {
  TableName: "jobPosts",
  IndexName: 'date_for_filter_purpose-index',
  ProjectionExpression: "job_id, company, title, posted_on",
  KeyConditionExpression: "posted_on BETWEEN :startDate AND :endDate",
  ExpressionAttributeValues: {
    ":startDate": {S: "2019-10-10T10:52:38.013Z"},
    ":endDate": {S: "2019-11-10T10:52:38.013Z"}
  }
};
I have one document in dynamoDB and here it is:
{
job_id:,
company: "xyz",
title: "abc",
posted_on: "2019-11-01T10:52:38.013Z"
}
Now, on executing this, I get the following error:
{
  "message": "Query key condition not supported",
  "code": "ValidationException",
  "time": "2019-11-11T06:15:37.231Z",
  "requestId": "J078NON3L8KSJE5E8I3IP9N0IBVV4KQNSO5AEMVJF66Q9ASUAAJG",
  "statusCode": 400,
  "retryable": false,
  "retryDelay": 12.382362030893768
}
I don't know what is wrong with the above query.
Update after Tommy's answer:
I removed the GSI on posted_on and re-created the table with job_id as partition key and posted_on as sort key. I get the following error:
{
  "message": "Query condition missed key schema element: job_id",
  "code": "ValidationException",
  "time": "2019-11-12T11:01:48.682Z",
  "requestId": "M9E793UQNJHPN5ULQFJI2NR0BVVV4KQNSO5AEMVJF66Q9ASUAAJG",
  "statusCode": 400,
  "retryable": false,
  "retryDelay": 42.52613025785952
}
As per this SO answer, GSI should be able to query the dates using BETWEEN keyword.
The answer you refer to relates to a query where the partition key has a specific value and the sort key is in a given range. It's analogous to select * from table where status=Z and date between X and Y. That's not what you're trying to do, if I read your question correctly. You want select * from table where date between X and Y. You cannot do this with a DynamoDB query: you cannot query a partition key by range.
If you knew that your max range of query dates was on a given day then you could create a GSI with a partition key set to the computed YYYYMMDD value of the date/time and whose sort key was the full date/time. Then you could query with a key condition expression for a partition key of the computed YYYYMMDD and a sort key between X and Y. For this to work, the YYYYMMDD of X and Y would have to be the same.
If you knew that your max range of query dates was a month then you could create a GSI with partition key set to the computed YYYYMM of the date/time and whose sort key was the full date/time. For this to work, the YYYYMM of X and Y would have to be the same.
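As a sketch of the day-bucket variant described above: the attribute name `posted_day` and the index name `posted_day-posted_on-index` are invented for illustration, and each item would need to store the computed YYYYMMDD value when it is written. The attribute values use the same low-level `{S: ...}` form as the question.

```javascript
// Hypothetical helper: derive the YYYYMMDD bucket from an ISO 8601 timestamp.
function dayBucket(isoTimestamp) {
  return isoTimestamp.slice(0, 10).replace(/-/g, "");
}

// Builds a Query against an assumed GSI with partition key "posted_day"
// (the YYYYMMDD bucket) and sort key "posted_on" (the full date/time).
function buildDayRangeQuery(startDate, endDate) {
  const bucket = dayBucket(startDate);
  if (bucket !== dayBucket(endDate)) {
    // Both ends of the range must share one partition key value.
    throw new Error("startDate and endDate must fall on the same day");
  }
  return {
    TableName: "jobPosts",
    IndexName: "posted_day-posted_on-index", // assumed index name
    KeyConditionExpression:
      "posted_day = :day AND posted_on BETWEEN :startDate AND :endDate",
    ExpressionAttributeValues: {
      ":day": { S: bucket },
      ":startDate": { S: startDate },
      ":endDate": { S: endDate },
    },
  };
}

const dayQuery = buildDayRangeQuery(
  "2019-11-01T00:00:00.000Z",
  "2019-11-01T23:59:59.999Z"
);
console.log(dayQuery.ExpressionAttributeValues[":day"].S);
```

The month variant would only change `dayBucket` to keep the first seven characters of the timestamp (YYYY-MM) instead of the first ten.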
I guess it's a little counter-intuitive, but DynamoDB supports only an equality (EQ) condition on partition key attributes.
As per KeyConditions Documentation
You must provide the index partition key name and value as an EQ condition. You can optionally provide a second condition, referring to the index sort key.
Furthermore, in Query API Documentation you can find the following
The condition must perform an equality test on a single partition key value.
The condition can optionally perform one of several comparison tests on a single sort key value. This allows Query to retrieve one item with a given partition key value and sort key value, or several items that have the same partition key value but different sort key values.
That explains the error message you are getting.
One of the solutions might be to create a composite primary key with posted_on attribute as the sort key, instead of the GSI. Then, depending on your use case and access pattern, you'll need to figure out which attribute would work best as the partition key.
This blog should help you to choose the right partition key for your schema.

How should I store this in DynamoDB if I want to search by these fields?

I have a DynamoDB table that contains the following keys:
id (value is a uuid) - this is the primary key
some_other_field - is just a regular key
I'd like to be able to query DynamoDB to get the items where some_other_field equals some value.
In order to do that, does some_other_field need to be a sort key?
Can I instead store this as a Document item, instead of a key-value item? I've found no documentation on how to do so, though.
I guess you have a DynamoDB table (not item) with the keys:
id - string - call it Partition Key or Hash Key
some_other_field - string|number|blob - call it Sort Key or Range Key, or a regular attribute if it is not part of the key
Whatever your case is, I would define a Global Secondary Index with the Partition Key: some_other_field and Projection: KEYS_ONLY.
You can query the index for the items with some_other_field = VALUE. Thus, you never scan the whole table; you only get what you need.
// There may be some small errors in names, consider that code a hint ;)
const params = {
  TableName: 'MY_TABLE_NAME',
  IndexName: 'MY_INDEX_NAME',
  KeyConditionExpression: '#pk = :pk',
  ExpressionAttributeNames: {
    '#pk': 'some_other_field', // GSI partition key
  },
  ExpressionAttributeValues: {
    ':pk': MY_VALUE,
  },
};
This is not the only solution. You can also scan the table with a filter expression to keep the items that match the condition, but that is more expensive than the solution above because it always scans the whole table.
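For comparison, a sketch of the scan alternative, assuming a DocumentClient-style `client` whose `scan(params).promise()` resolves to an object with `Items` and `LastEvaluatedKey` (as in the AWS SDK for JavaScript v2). The helper names are made up for illustration.

```javascript
// Hypothetical helper: Scan parameters with a filter on some_other_field.
// The filter is applied after the items are read, so the whole table is still read.
function buildScanWithFilter(value) {
  return {
    TableName: "MY_TABLE_NAME",
    FilterExpression: "#f = :v",
    ExpressionAttributeNames: { "#f": "some_other_field" },
    ExpressionAttributeValues: { ":v": value },
  };
}

// Hypothetical helper: a scan is paginated, so keep requesting pages
// until no LastEvaluatedKey is returned.
async function scanAll(client, params) {
  const items = [];
  let startKey;
  do {
    const pageParams = startKey
      ? { ...params, ExclusiveStartKey: startKey }
      : params;
    const page = await client.scan(pageParams).promise();
    items.push(...(page.Items || []));
    startKey = page.LastEvaluatedKey;
  } while (startKey);
  return items;
}

console.log(buildScanWithFilter("some value").FilterExpression);
```

The pagination loop is what makes the cost visible: every page of the table is fetched even when only a handful of items pass the filter.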

Dynamodb scan in sorted order

Hi, I have a DynamoDB table. I want the service to return all the items in this table, sorted by one attribute.
Do I need to create a global secondary index for this? If that is the case, what should be the hash key, what is the range key?
(Note that query on gsi must specify a "EQ" comparator on the hash key of GSI.)
Thanks a lot!
Erben
If you know the HashKey, then any query will return the items sorted by Range key. From the documentation:
Query results are always sorted by the range key. If the data type of the range key is Number, the results are returned in numeric order. Otherwise, the results are returned in order of UTF-8 bytes. By default, the sort order is ascending. To reverse the order, set the ScanIndexForward parameter to false.
Now, if you need to return all the items, you should use a scan. You cannot order the results of a scan.
Another option is to use a GSI (example). Here, you see that the GSI contains only HashKey. The results I guess will be in sorted order of this key (I didn't check this part in a program yet!).
As of now, a DynamoDB scan cannot return sorted results.
You need to use a query with a new global secondary index (GSI) that has a hash key and a range key. The trick is to use a hash key that is assigned the same value for every item in your table.
I recommend adding a new field to every item, calling it "Status", and setting its value to "OK" or something similar.
Then your query to get all the results sorted would look like this:
{
  TableName: "YourTable",
  IndexName: "Status-YourRange-index",
  KeyConditions: {
    Status: {
      ComparisonOperator: "EQ",
      AttributeValueList: [
        "OK"
      ]
    }
  },
  ScanIndexForward: false
}
The docs for how to write GSI queries are found here: http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GSI.html#GSI.Querying
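KeyConditions is a legacy parameter, so the same query can also be written with KeyConditionExpression. A sketch, assuming the "Status-YourRange-index" GSI and the constant "OK" value from the answer above; the helper name is invented, and plain values are used as with the DocumentClient.

```javascript
// Hypothetical helper: query every item through the constant-valued hash key.
// "Status" is a DynamoDB reserved word, so it is referenced through an
// expression attribute name instead of appearing directly in the expression.
function buildSortedAllQuery(descending) {
  return {
    TableName: "YourTable",
    IndexName: "Status-YourRange-index",
    KeyConditionExpression: "#s = :ok",
    ExpressionAttributeNames: { "#s": "Status" },
    ExpressionAttributeValues: { ":ok": "OK" },
    // false returns the range key in descending order, true in ascending order.
    ScanIndexForward: !descending,
  };
}

const sortedParams = buildSortedAllQuery(true);
console.log(sortedParams.ScanIndexForward);
```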
The approach I followed to solve this problem was to create a Global Secondary Index as below. I'm not sure it is the best approach, but I'm posting it in case it is useful to someone.
Hash Key | Range Key
------------------------------------
Date value of CreatedAt | CreatedAt
The HTTP API requires the user to specify the number of days of data to retrieve; the default is 24 hours.
This way, I can always specify the hash key as the current date, and the range key can use the > and < operators while retrieving. The data is also spread across multiple partitions.
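A sketch of how such a window might be turned into one query per date partition. The attribute name `CreatedDate`, the index name, and the table name are assumptions, as is storing `CreatedAt` as an ISO 8601 UTC string; plain values are used as with the DocumentClient.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: one Query per UTC calendar day touched by the
// window that ends at `now` and starts `days` days earlier.
function buildWindowQueries(now, days) {
  const since = new Date(now.getTime() - days * DAY_MS);
  const sinceIso = since.toISOString();
  const queries = [];
  // Start at midnight UTC of the first day in the window and step day by day.
  let t = Date.UTC(
    since.getUTCFullYear(),
    since.getUTCMonth(),
    since.getUTCDate()
  );
  for (; t <= now.getTime(); t += DAY_MS) {
    queries.push({
      TableName: "YourTable", // placeholder
      IndexName: "CreatedDate-CreatedAt-index", // assumed index name
      KeyConditionExpression: "CreatedDate = :day AND CreatedAt > :since",
      ExpressionAttributeValues: {
        ":day": new Date(t).toISOString().slice(0, 10),
        ":since": sinceIso,
      },
    });
  }
  return queries;
}

const windowQueries = buildWindowQueries(new Date("2019-11-12T10:00:00.000Z"), 1);
console.log(windowQueries.map((q) => q.ExpressionAttributeValues[":day"]));
```

A 24-hour window usually touches two calendar days, so the results of the individual queries would need to be concatenated by the caller.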

DynamoDB: How to perform conditional write to enforce unique Hash + Range key

I am using DynamoDB to store events.
They are stored in one event table with a hash key 'Source ID' and a range key 'version'. Every time a new event occurs for a source, I want to add a new item with the source ID and an incremented version number.
Is it possible to specify a conditional write so that a duplicate item (same hash key and same range key) can never exist? And if so, how would you do this?
I have done this successfully for tables with just a Hash Key:
Map<String, ExpectedAttributeValue> expected = new HashMap<String, ExpectedAttributeValue>();
expected.put("key", new ExpectedAttributeValue().withExists(false));
But not sure how to handle hash + range keys....
I don't know the Java SDK well, but you can specify "Exists=false" on both the range key and the hash key.
Maybe a better idea would be to use a timestamp instead of a version number? Otherwise, there are also techniques to generate unique ids.
I was trying to enforce a unique combination of hash and range keys and came across this post. I found that it didn't completely answer my question but certainly pointed me in the right direction. This is an attempt to tidy up the loose ends.
It seems that DynamoDB actually enforces a unique combination of hash and range key by design. I quote
"All items in the table must have a value for the primary key attribute and Amazon DynamoDB ensures that the value for that name is unique"
from http://aws.amazon.com/dynamodb/ under the section with the heading Primary Key.
In my own tests using putItem with the aws-sdk for nodejs I was able to post two identical items without generating an error. When I checked the database, only one item was actually inserted. It seems that the second call to putItem with the same hash and range key combination is treated like an update to the original item.
I too received the error "Cannot expect an attribute to have a specified value while expecting it to not exist" when I tried to set the Exists=false option on the hash key and range key with the values set. To resolve this error, I removed the value under the expected hash and range key, and it started to generate a validation error when I tried to insert the same key twice.
So, my insert command looks like this (will be different for Java, but hopefully you get the idea)
{
  "TableName": "MyTableName",
  "Item": {
    "HashKeyFieldName": {
      "S": HashKeyValue
    },
    "RangeKeyFieldName": {
      "N": currentTime.getTime().toString()
    },
    "OtherField": {
      "N": "61404032632"
    }
  },
  "Expected": {
    "HashKeyFieldName": { "Exists": false },
    "RangeKeyFieldName": { "Exists": false }
  }
}
Whereas originally I was trying to do a conditional insert to check if there was a hash value and range value the same as what I was trying to insert, now I just need to check if the HashField and RangeField exist at all. If they exist, that means I am updating an item rather than inserting.
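Expected is a legacy parameter; the same check can be expressed with ConditionExpression and attribute_not_exists. A sketch in JavaScript, reusing the placeholder table and field names from the command above; the helper name is invented.

```javascript
// Hypothetical helper: PutItem parameters that only succeed when no item
// with the same hash + range key exists yet.
function buildInsertOnlyPut(hashKeyValue, timestampMs) {
  return {
    TableName: "MyTableName",
    Item: {
      HashKeyFieldName: { S: hashKeyValue },
      RangeKeyFieldName: { N: String(timestampMs) },
      OtherField: { N: "61404032632" },
    },
    // The condition is evaluated against the existing item with this exact
    // hash + range key (if any), so checking one key attribute is enough.
    ConditionExpression: "attribute_not_exists(#hk)",
    ExpressionAttributeNames: { "#hk": "HashKeyFieldName" },
  };
}

const putParams = buildInsertOnlyPut("source-1", Date.now());
console.log(putParams.ConditionExpression);
```

When an item with the same key already exists, the write is rejected with a ConditionalCheckFailedException rather than silently overwriting the item.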