My AWS DynamoDB table has:
Client (Primary Key),
folder_location (non-key attribute),
script_name (non-key attribute)
I want to retrieve records by the Client and folder_location attributes using BatchGetItemRequest, but I get the error below:
Failed to retrieve items.com.amazonaws.services.dynamodbv2.model.AmazonDynamoDBException: The provided key element does not match the schema (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: ValidationException;
Is there a way to do this with BatchGetItemRequest only?
You could instead use a Query or Scan operation and specify a Filter Expression to limit results based on the folder_location.
To use BatchGetItemRequest you must provide the key, per https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_BatchGetItem.html:
The BatchGetItem operation returns the attributes of one or more items
from one or more tables. You identify requested items by primary key.
Keys - An array of primary key attribute values that define specific
items in the table. For each primary key, you must provide all of the
key attributes. For example, with a simple primary key, you only need
to provide the partition key value. For a composite key, you must
provide both the partition key value and the sort key value.
I recommend you double check that the key you are providing matches the key defined in the table (name and data type). If so, then try one of those other options with a FilterExpression.
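As a rough illustration of the Query-plus-filter option, here is a minimal Python (boto3) sketch. It assumes Client is the table's only key attribute, as described in the question; the function name is hypothetical, and `table` is expected to be a boto3 Table resource (or anything with a compatible `query` method).

```python
def find_by_client_and_folder(table, client, folder_location):
    """Query by the key attribute, then filter on a non-key attribute.

    `table` is assumed to behave like a boto3 Table resource, e.g.
    boto3.resource("dynamodb").Table("my_table")  # hypothetical table name
    """
    params = {
        # Only key attributes may appear in the key condition.
        "KeyConditionExpression": "#c = :client",
        # Non-key attributes go in the filter, applied after items are read.
        "FilterExpression": "#f = :folder",
        # Name placeholders avoid any clash with DynamoDB reserved words.
        "ExpressionAttributeNames": {"#c": "Client", "#f": "folder_location"},
        "ExpressionAttributeValues": {":client": client, ":folder": folder_location},
    }
    items = []
    while True:
        page = table.query(**params)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return items
        # Results are paginated; continue from where the last page stopped.
        params["ExclusiveStartKey"] = last_key
```

Because the filter runs after the read, it reduces what is returned but not the read capacity consumed. With a simple primary key the query matches at most one item, so the filter only decides whether that item is returned.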
I have a table with a "userId" column (set as the partition key) and a "createdAt" column (set as the sort key), so together they form a composite primary key.
I also need to find the exact row when I don't have the user ID available, so I added another column, "id", and created a global secondary index on it.
In my case, should I make the "id" column the primary key and remove "userId" as the partition key, or would that defeat the purpose of partitioning in DynamoDB?
Similarly, if I need to delete a row from the table, should I send the "createdAt" field from the front end so the exact row can be found? Does this make sense? Sending the "id" of the row seems better to me for deleting the row.
You probably don't want to put a timestamp in your user primary keys. Why? You'd need to know the exact time the user was created to fetch a user, which is probably not what you want.
Consider using a partition key of USER#<user_id> and a sort key of something predictable, like A or METADATA or USER#<user_id>. This allows you to fetch/delete a user by their ID.
If you have access patterns around fetching users in order of account creation, you can create a GSI with the sort key set to the createdAt attribute.
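To make the key design above concrete, here is a small Python sketch. The attribute names `PK` and `SK` and the function names are assumptions for illustration only; `table` is expected to be a boto3 Table resource.

```python
def user_key(user_id):
    """Build the full primary key for a user item from the user ID alone."""
    # "PK" and "SK" are hypothetical attribute names for the partition
    # and sort key; "METADATA" is the predictable sort key value.
    return {"PK": f"USER#{user_id}", "SK": "METADATA"}


def get_user(table, user_id):
    """Fetch one user; returns None when no such item exists."""
    return table.get_item(Key=user_key(user_id)).get("Item")


def delete_user(table, user_id):
    """Delete one user without needing to know its creation timestamp."""
    table.delete_item(Key=user_key(user_id))
```

Since both key parts are derived from the ID, the front end only has to send the user ID to fetch or delete the item.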
I have the following JSON in dynamo:
{
    cdItem: "123456",
    dtItem: "2021-03-01"
}
My hashkey is cdItem.
I need dtItem to be part of the key as well, so that if I send an item with the same cdItem but a different dtItem, it creates a new record instead of updating the existing one.
How can I do this? Or, is it possible to do this?
There are multiple ways you can implement this and they depend on your access patterns.
If you only ever request an item for which you know both the cdItem and the dtItem values, you could overload the partition key by concatenating them, e.g. 123456#2021-03-01. That way you could keep your existing table.
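A minimal Python sketch of the concatenation idea, assuming the existing hash key attribute cdItem stores the combined value and dtItem is kept as an ordinary attribute. The function names are hypothetical, and `table` is expected to be a boto3 Table resource.

```python
def overloaded_key(cd_item, dt_item):
    """Combine both values into one partition key value."""
    return f"{cd_item}#{dt_item}"


def put_overloaded(table, cd_item, dt_item, **attrs):
    """Write an item whose hash key encodes both cdItem and dtItem."""
    item = {"cdItem": overloaded_key(cd_item, dt_item), "dtItem": dt_item, **attrs}
    table.put_item(Item=item)
    return item


def get_overloaded(table, cd_item, dt_item):
    """Read an item back; both values must be known to rebuild the key."""
    key = {"cdItem": overloaded_key(cd_item, dt_item)}
    return table.get_item(Key=key).get("Item")
```

Two items with the same cdItem but different dtItem values now get different key values, so the second write no longer overwrites the first.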
A more flexible solution would be using a composite primary key, which is a combination of a partition and a sort key. This requires you to create a new table.
I'd set it up like this:
cdItem (Partition Key) | dtItem (Sort Key)
123456 | 2021-02-27
123456 | 2021-02-28
123456 | 2021-03-01
654321 | 2021-03-01
You'll have to provide both of those attributes on each PutItem request.
You can also call GetItem with both values to retrieve a single item. With the Query API you can select all dtItem values for a given cdItem value and narrow the results with a condition on dtItem.
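The three operations on the composite-key table could look roughly like this in Python. This is a sketch, not a definitive implementation: the function names are hypothetical, `table` is assumed to be a boto3 Table resource for the new table, and pagination of query results is left out for brevity.

```python
def put_record(table, cd_item, dt_item, **attrs):
    """PutItem: both key attributes must be present on every write."""
    table.put_item(Item={"cdItem": cd_item, "dtItem": dt_item, **attrs})


def get_record(table, cd_item, dt_item):
    """GetItem: both key values identify exactly one item (or none)."""
    key = {"cdItem": cd_item, "dtItem": dt_item}
    return table.get_item(Key=key).get("Item")


def query_date_range(table, cd_item, start, end):
    """Query: equality on the partition key, a range on the sort key."""
    response = table.query(
        KeyConditionExpression="cdItem = :cd AND dtItem BETWEEN :start AND :end",
        ExpressionAttributeValues={":cd": cd_item, ":start": start, ":end": end},
    )
    return response["Items"]
```

The range condition relies on dtItem being stored as an ISO 8601 date string (as in the example rows), which sorts chronologically when compared as text.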
I noticed that a DynamoDB query/scan returns only a subset of each document; it appears to be just the key columns.
This means I need to do a separate BatchGetItem call to get the actual documents referenced by those keys.
I am not using a projection expression, and according to the documentation this means the whole item should be returned.
How do I get query to return the entire document so I don't have to do a separate batch get?
One example bit of code that shows this is below. It prints out found documents, yet they contain only the primary key, the secondary key, and the sort key.
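    import json

    import boto3

    db = boto3.resource('dynamodb')  # assumed: db is a boto3 DynamoDB resource
    t1 = db.Table(tname)  # tname holds the table name
    q = {
        'IndexName': 'mysGSI',
        # Equality on the index partition key, prefix match on the index sort key
        'KeyConditionExpression': "secKey = :val1 AND "
                                  "begins_with(sortKey, :status)",
        'ExpressionAttributeValues': {
            ":val1": 'XXX',
            ":status": 'active-',
        },
    }
    res = t1.query(**q)
    for doc in res['Items']:
        # default=str keeps json.dumps from failing on Decimal values
        print(json.dumps(doc, default=str))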
This situation is discussed in the documentation for the Select parameter. You have to read quite a lot to find this, which is not ideal.
If you query or scan a global secondary index, you can only request
attributes that are projected into the index. Global secondary index
queries cannot fetch attributes from the parent table.
Basically:
If you query the parent table then you get all attributes by default.
If you query an LSI then you get all attributes by default - they're retrieved from the projection in the LSI if all attributes are projected into the index (so that costs nothing extra) or from the base table otherwise (which will cost you more reads).
If you query or scan a GSI, you can only request attributes that are projected into the index. GSI queries cannot fetch attributes from the parent table.
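One way to avoid the second lookup is to use a GSI that projects every attribute. The projection of an existing index cannot be changed, so this means creating a new index. The sketch below only builds the arguments for the low-level `update_table` call; the function name and the example index name are hypothetical, and the key attributes are assumed to be strings, as the query values in the question suggest.

```python
def gsi_with_all_attributes(table_name, index_name):
    """Build update_table arguments for a GSI that projects all attributes."""
    return {
        "TableName": table_name,
        # Every attribute used in the index key schema must be declared here.
        "AttributeDefinitions": [
            {"AttributeName": "secKey", "AttributeType": "S"},
            {"AttributeName": "sortKey", "AttributeType": "S"},
        ],
        "GlobalSecondaryIndexUpdates": [
            {
                "Create": {
                    "IndexName": index_name,
                    "KeySchema": [
                        {"AttributeName": "secKey", "KeyType": "HASH"},
                        {"AttributeName": "sortKey", "KeyType": "RANGE"},
                    ],
                    # ALL copies every attribute of the item into the index.
                    "Projection": {"ProjectionType": "ALL"},
                    # Tables in provisioned mode also need a
                    # "ProvisionedThroughput" entry here.
                }
            }
        ],
    }
```

Assumed usage: `boto3.client("dynamodb").update_table(**gsi_with_all_attributes("my_table", "mysGSI-all"))`. Projecting all attributes increases the storage and write cost of the index, so it is a trade-off against the extra reads of a second lookup.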
I've recently started learning DynamoDB and created a table 'Communication' with the following attributes (along with the DynamoDB type):
Partition Key: Communication ID (randomly generated sequence number or UUID): String
Sort Key: User ID: String
Attributes/Columns:
Communication_Mode: String
Communication_Channel: String
Communication_Preference: String (possible values Y/N)
DateTime: Number
Use case: a user can opt out of communication (Communication_Preference: N) and a month later opt back in (Communication_Preference: Y), meaning that for the same User ID there can be more than one record, since the partition key is a randomly generated number.
If I have to query the above table and retrieve the last inserted record for a specific User ID, do I need to create a Global Secondary Index on DateTime?
Can someone correct me if my understanding is wrong, or propose the best option to meet the above requirement? Thanks!
I am trying to query a dynamodb table. When I'm using the begins with operator I'm getting the following error.
{u'Message': u'All queries must have a condition on the hash key, and
it must be of type EQ', u'__type':
u'com.amazon.coral.validate#ValidationException'}
    result_set = tb_places.query_2(
        place_name__beginswith="ame",
    )
Here place_name is a Global Secondary Index
Regardless of whether you are querying a table or an index, the only operator that can be applied to a hash key attribute is EQ. Alternatively, you could use BEGINS_WITH on a range key.
For a query on a table, you can have conditions only on the table
primary key attributes. You must provide the hash key attribute name
and value as an EQ condition. You can optionally provide a second
condition, referring to the range key attribute.[...]
For a query on an index, you can have conditions only on the index key
attributes. You must provide the index hash attribute name and value
as an EQ condition. You can optionally provide a second condition,
referring to the index key range attribute.
Source: http://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_Query.html
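As a sketch of what a valid query looks like, the helper below builds parameters with an EQ condition on the index hash key and a begins_with condition on the index range key. It targets boto3's `Table.query` rather than the older boto `query_2` call used in the question, and the function name, index name and attribute names in the usage line are hypothetical.

```python
def prefix_query_params(index_name, hash_name, hash_value, range_name, prefix):
    """Build query arguments: EQ on the hash key, prefix match on the range key."""
    return {
        "IndexName": index_name,
        # The hash key condition must be an equality; begins_with is only
        # allowed on the range key.
        "KeyConditionExpression": "#h = :h AND begins_with(#r, :prefix)",
        # Name placeholders avoid any clash with DynamoDB reserved words.
        "ExpressionAttributeNames": {"#h": hash_name, "#r": range_name},
        "ExpressionAttributeValues": {":h": hash_value, ":prefix": prefix},
    }
```

Assumed usage: `table.query(**prefix_query_params("country-place_name-index", "country", "US", "place_name", "ame"))`, which requires an index whose hash key is the hypothetical `country` attribute and whose range key is `place_name`.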