Iteration over DynamoDB partition keys

I am using AWS.DynamoDB.DocumentClient. I want to iterate over the items and conditionally update them.
I have a table which contains 4000 items. When I scan the table, even if I use ProjectionExpression, I get only about 480 results. This is because of the scan size limit (1 MB). I'm pretty sure that if I fetched only the partition keys, the result would be less than 1 MB.
There are some similar questions about scanning specific items, but that's not what I'm struggling with. What can I do to list all partition keys of my table? Thanks.
Here is my scan operation:
docClient.scan({
TableName: "Recipes",
"ProjectionExpression": "#key",
"ExpressionAttributeNames": {
"#key": "id"
}
}, async (err, recipes) => {
console.log("scanned recipes:" + recipes.Items.length)
//output: 477 (but the table has 4000 items)
})

Can you show the scan operation you've tried that isn't working for you?
The following worked for me (my partition key is named PK)
ddbClient.scan(
{
"TableName": "<MY TABLE NAME>",
"ProjectionExpression": "#PK",
"ExpressionAttributeNames": {
"#PK": "PK"
}
}
)
Keep in mind that DynamoDB will consider the entire item size when calculating the 1MB limit, even if you use a projection expression that limits the response to just a few attributes. If your scan result returns a LastEvaluatedKey, you know that DynamoDB is paginating the results.

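I found the solution in the documentation. ExclusiveStartKey is the answer: pass the LastEvaluatedKey from the previous response as the ExclusiveStartKey of the next scan, and repeat until no LastEvaluatedKey is returned.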
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GettingStarted.NodeJs.04.html
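For anyone landing here later, this is roughly what that loop can look like. It is only a sketch: `scanAllKeys` is a name I made up, and it assumes `docClient` is an AWS.DynamoDB.DocumentClient from SDK v2 (so `.scan(params).promise()` is available) and that the partition key is a top-level attribute.

```javascript
// Sketch: collect every partition key by following LastEvaluatedKey.
// Assumption: docClient is an AWS.DynamoDB.DocumentClient (AWS SDK v2).
// scanAllKeys is a hypothetical helper name, not part of the SDK.
async function scanAllKeys(docClient, tableName, keyName) {
  const params = {
    TableName: tableName,
    ProjectionExpression: "#key",
    ExpressionAttributeNames: { "#key": keyName },
  };
  const keys = [];
  let page;
  do {
    page = await docClient.scan(params).promise();
    for (const item of page.Items) {
      keys.push(item[keyName]);
    }
    // Resume the next request where this page stopped.
    // LastEvaluatedKey is undefined on the final page, which ends the loop.
    params.ExclusiveStartKey = page.LastEvaluatedKey;
  } while (page.LastEvaluatedKey);
  return keys;
}

// Usage (needs real credentials and a real table):
// const keys = await scanAllKeys(docClient, "Recipes", "id");
```

Each page still counts the full item size against the 1 MB limit, so a 4000-item table will usually take several round trips even when only the key is projected.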

Related

How to compare strings in DynamoDB using Lambda NodeJS?

I have a Lambda function that makes some requests to DynamoDB.
var ddb = new AWS.DynamoDB({apiVersion: '2012-08-10'});
const lookupminutes = 10;
var LookupDate = new Date(Date.now() - 60 * 1000 * lookupminutes); // lookupminutes ago, in milliseconds
params = {
TableName: TableName,
IndexName: "requestdate-index",
KeyConditionExpression: "requestdate > :startdate",
ExpressionAttributeValues: {":startdate": {S: LookupDate.toISOString()}
},
ProjectionExpression: "id, requestdate"
};
var results = await ddb.query(params).promise();
When running the Lambda function, I get the error "Query key condition not supported" on the line that runs the query against DynamoDB.
The field requestdate is stored in the table as a string.
Does anyone know what I am doing wrong?
Thanks.
You cannot use anything other than an equals operator on a partition key:
params = {
TableName: TableName,
IndexName: "requestdate-index",
KeyConditionExpression: "requestdate = :startdate",
ExpressionAttributeValues: {":startdate": {S: LookupDate.toISOString()}},
ProjectionExpression: "id, requestdate"
};
If you need all of the data from the last 10 minutes, then you have two choices, neither of which is very scalable unless you shard your key (1a):
1. Put all the data in your index under the same partition key, with the timestamp as the sort key. Then use a KeyConditionExpression like:
gsipk = 1 AND timestamp > (now - 10 mins)
As all of the items are under the same partition key, the query will be efficient, but at the cost of scalability, as you will essentially bottleneck your write throughput at 1000 WCU.
1a. Probably the best option if you need to scale beyond 1000 WCU is to do just as above, except use a random number within a range for the partition key. For example, a range of 0-9 gives you 10 unique partition keys, allowing you to scale to 10k WCU, but it requires you to issue 10 Query requests in parallel to retrieve the data.
2. Use a Scan with a FilterExpression on the base table. If you do not want to place everything under the same key on the GSI, then you can just Scan and add a filter. This becomes slow and expensive as the table grows.
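A rough sketch of option 1a, in case it helps. Everything named here is an assumption for illustration: a GSI called "shard-requestdate-index" whose partition key `gsipk` holds a number from 0 to shardCount - 1 and whose sort key is `requestdate`, a DocumentClient from SDK v2, and the made-up helper name `queryRecentAcrossShards`. Pagination inside each shard is left out to keep it short.

```javascript
// Sketch: query every shard of a write-sharded GSI in parallel, then merge.
// Assumptions (hypothetical names): index "shard-requestdate-index",
// partition key "gsipk" (0..shardCount-1), sort key "requestdate" (ISO string).
// docClient is assumed to be an AWS.DynamoDB.DocumentClient (SDK v2).
async function queryRecentAcrossShards(docClient, tableName, shardCount, sinceIso) {
  const queries = [];
  for (let shard = 0; shard < shardCount; shard++) {
    queries.push(
      docClient.query({
        TableName: tableName,
        IndexName: "shard-requestdate-index",
        // Equality on the partition key, range condition on the sort key.
        KeyConditionExpression: "#pk = :shard AND #ts > :since",
        ExpressionAttributeNames: { "#pk": "gsipk", "#ts": "requestdate" },
        ExpressionAttributeValues: { ":shard": shard, ":since": sinceIso },
      }).promise()
    );
  }
  const pages = await Promise.all(queries);
  // Each shard comes back sorted on its own, so re-sort after merging.
  // ISO 8601 strings from toISOString() sort chronologically as text.
  return pages
    .flatMap((page) => page.Items)
    .sort((a, b) => a.requestdate.localeCompare(b.requestdate));
}

// Usage (needs a real table and index):
// const since = new Date(Date.now() - 10 * 60 * 1000).toISOString();
// const items = await queryRecentAcrossShards(docClient, "MyTable", 10, since);
```

The writer side would pick `gsipk` at random in the same range when it stores each item, which is what spreads the write load.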

Scanning With sort_key in DynamoDB

I have a table that will contain < 1300 entries at about 600 bytes each. The goal is to display pages of results ordered by epoch date. Right now, for any given search I request the full list of ids using a filtered scan, then handle paging on the UI side. For each page, I pass a chunk of ids to retrieve the full entry (also currently a filtered scan). Ideally, the list of ids would return sorted, but if I understand the docs correctly, only results that have the same partition key are sorted. My current partition key is a uuid, so all entries are unique.
Current Table Configuration
Do I essentially need to use a throwaway key for the partition just to get results returned by date? Maybe the size of my table makes this unreasonable to begin with? Is there a better way to handle this? I have another field, "is_active" that's currently a boolean and could be used for the partition key if I converted it to numeric, but that might complicate my update method. 95% of the time, every entry in the db will be "active", so this doesn't seem efficient.
Scan Index
let params = {
TableName: this.TABLE_NAME,
IndexName: this.INDEX_NAME,
ScanIndexForward: false,
ProjectionExpression: "id",
FilterExpression: filterSqlStatement,
ExpressionAttributeValues: filterValues,
ExpressionAttributeNames: {
"#n": "name"
}
};
let results = await this.DDB_CLIENT.scan(params).promise();
let finalizedResults = results ? results.Items : [];
Given that your dataset is relatively small you might try a fixed partition key with a sort key of the date and the UUID. You'd query by the partition key (which would be a fixed value) and the results would come back sorted. This isn't the best idea with large data sets, but < 1300 is not large.
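Something along these lines, as a sketch rather than a drop-in. The attribute names `listing` and `sk`, the fixed value "ALL", and both helper names are assumptions I picked for the example. The parameters are in DocumentClient form (plain JavaScript values).

```javascript
// Sketch: fixed partition key plus a "date#uuid" sort key.
// Assumptions (hypothetical names): partition key attribute "listing" that
// always holds "ALL", and sort key attribute "sk" built by makeSortKey.

// Zero-pad the epoch so that string order matches numeric order.
function makeSortKey(epochSeconds, uuid) {
  return String(epochSeconds).padStart(12, "0") + "#" + uuid;
}

// Build Query parameters that return one page of items, newest first.
function buildNewestFirstQuery(tableName, pageSize, startKey) {
  const params = {
    TableName: tableName,
    KeyConditionExpression: "#pk = :pk",
    ExpressionAttributeNames: { "#pk": "listing" },
    ExpressionAttributeValues: { ":pk": "ALL" },
    ScanIndexForward: false, // descending sort key order (Query only)
    Limit: pageSize,
  };
  if (startKey) {
    // Pass the previous page's LastEvaluatedKey to fetch the next page.
    params.ExclusiveStartKey = startKey;
  }
  return params;
}

// Usage (needs a real table):
// const page = await docClient.query(buildNewestFirstQuery("MyTable", 25)).promise();
```

If the table keeps the uuid as its partition key, the same idea should work through a GSI by adding IndexName to the parameters.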

Is it possible to use multiple sort values in aws sdk dynamodb batchGetItem?

Is it possible to fetch items for multiple sort key values with the AWS SDK DynamoDB batchGetItem in one request? My aim is to get the results for several sort keys at once. Or what is an efficient way of doing such a query?
E.g
Partition key / Sort key
A 1
A 2
B 3
E.g. input: partition key A with sort keys 1 and 2
BatchGetItem requires you to specify the full primary key. That means you'd need to specify the partition key and the sort key at the same time.
For example, you could do the following (in pseudocode):
ddbclient.batchGetItem({
"RequestItems": {
"YOUR_TABLE_NAME": {
"Keys": [
{
"PK": {"S": "A"},
"SK": {"N": "1"}
},
{
"PK": {"S": "A"},
"SK": {"N": "2"}
}
]
}
}
})
However, if you do not know the sort key and want to fetch all the items with Partition Key = "A", you should use the query operation. The query operation does not require you to specify the sort key.
dynamoDbLib.query({
TableName: "YOUR_TABLE_NAME",
KeyConditionExpression: "PK = :pk",
// values go in placeholders; with a DocumentClient use {":pk": "A"} instead
ExpressionAttributeValues: {":pk": {"S": "A"}}
});

update dynamodb golang with multiple keys

I want to update a DynamoDB table with a list of keys. My struct is:
{
ID int,
Code string
}
I have a list of Code values, and I want DynamoDB to update a record when its Code matches any of those values:
{ID : 1, Code: "anything"} {ID: 1, Code: "another_code"}
When the table finds a record with ID 1 and Code "anything" or "another_code", it should update that record's value. I noticed that this does not seem to be possible and that I should use a loop and update one record at a time. Is that true?
return dynamodb.UpdateItemInput{
TableName: &tableName,
Key: attributeObject,
UpdateExpression: &expression,
ConditionExpression: &conditional,
ExpressionAttributeValues: expressionAttributeValues,
ExpressionAttributeNames: expressionAttributeNames,
}
Currently, DynamoDB's batch operations only support reading (BatchGetItem) or putting and deleting (BatchWriteItem) multiple items at a time; updating existing items in a batch is not supported. So, as you suggested, you'll need to loop through each key you want to update and make a separate UpdateItem request.
See also: How to update multiple items in a DynamoDB table at once
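The shape of that loop, sketched with the Node.js DocumentClient rather than Go, so treat it as an outline to translate. It assumes ID is the partition key and Code is the sort key, that the attribute being set is called `value`, and that errors carry the SDK v2 `code` field; `updateEachCode` is a made-up name.

```javascript
// Sketch: one conditional UpdateItem per key, since batch updates do not exist.
// Assumptions: ID is the partition key, Code is the sort key, the attribute to
// set is "value", and docClient is an AWS.DynamoDB.DocumentClient (SDK v2).
async function updateEachCode(docClient, tableName, id, codes, newValue) {
  const results = [];
  for (const code of codes) {
    try {
      await docClient.update({
        TableName: tableName,
        Key: { ID: id, Code: code },
        UpdateExpression: "SET #v = :v",
        // Only touch items that already exist; do not create new ones.
        ConditionExpression: "attribute_exists(#id)",
        ExpressionAttributeNames: { "#v": "value", "#id": "ID" },
        ExpressionAttributeValues: { ":v": newValue },
      }).promise();
      results.push({ code, updated: true });
    } catch (err) {
      // A failed condition means no item had this key; anything else is rethrown.
      if (err.code !== "ConditionalCheckFailedException") throw err;
      results.push({ code, updated: false });
    }
  }
  return results;
}

// Usage (needs a real table):
// await updateEachCode(docClient, "MyTable", 1, ["anything", "another_code"], "new");
```

The requests run one after another here; that keeps the example simple and makes it obvious which key failed.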

dynamodb - scan items where map contains a key

I have a table that contains a field (not a key field), called appsMap, and it looks like this:
appsMap = { "qa-app": "abc", "another-app": "xyz" }
I want to scan all rows whose appsMap contains the key "qa-app" (the value is not important, just the key). I tried something like this but it doesn't work in the way I need:
FilterExpression = '#appsMap.#app <> :v',
ExpressionAttributeNames = {
"#app": "qa-app",
"#appsMap": "appsMap"
},
ExpressionAttributeValues = {
":v": { "NULL": True }
},
ProjectionExpression = "deviceID"
What's the correct syntax?
Thanks.
There is a discussion on the subject here:
https://forums.aws.amazon.com/thread.jspa?threadID=164470
You might be missing this part from the example:
ExpressionAttributeValues: {":name":{"S":"Jeff"}}
However, I just wanted to echo what has already been said: a scan is an expensive operation that goes through every item, which makes your database hard to scale.
Unlike with other databases, you have to do plenty of setup with Dynamo in order to get it to perform at its best. Here is a suggestion:
1) Convert this into a root value, for example add to the root: qaExist, with possible values of 0|1 or true|false.
2) Create secondary index for the newly created value.
3) Make query on the new index specifying 0 as a search parameter.
This will make your system very fast and very scalable regardless of how many records you get in there later on.
If I understand the question correctly, you can do the following:
FilterExpression = 'attribute_exists(#0.#1)',
ExpressionAttributeNames = {
"#0": "appsMap",
"#1": "qa-app"
},
ProjectionExpression = "deviceID"
Since you're being a bit vague about your expectations and what's happening ("I tried something like this but it doesn't work in the way I need"), I'd like to mention that a scan with a filter is very different from a query.
Filters are applied on the server, but only after the scan request has read the data. The scan still iterates over all the data in your table; instead of returning every item, it applies the filter to each page of results. That saves you some network bandwidth, but it can return empty pages as you page through your entire table.
You could look into creating a GSI on the table if this is a query you expect to have to run often.
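To make the "empty pages" point concrete, here is a sketch of a filtered scan that keeps paging even when a page comes back with no items. It assumes a DocumentClient from SDK v2; `scanDevicesWithApp` is a hypothetical helper name, and the attribute names are the ones from the question.

```javascript
// Sketch: filtered scan that follows LastEvaluatedKey through empty pages.
// Assumption: docClient is an AWS.DynamoDB.DocumentClient (AWS SDK v2).
// scanDevicesWithApp is a hypothetical helper name.
async function scanDevicesWithApp(docClient, tableName, appName) {
  const params = {
    TableName: tableName,
    // The placeholder is needed because "qa-app" contains a dash.
    FilterExpression: "attribute_exists(#m.#app)",
    ExpressionAttributeNames: { "#m": "appsMap", "#app": appName },
    ProjectionExpression: "deviceID",
  };
  const deviceIds = [];
  let pagesRead = 0;
  let page;
  do {
    page = await docClient.scan(params).promise();
    pagesRead += 1;
    // A page can contain zero items and still carry a LastEvaluatedKey,
    // because the filter runs after the page has been read.
    for (const item of page.Items) {
      deviceIds.push(item.deviceID);
    }
    params.ExclusiveStartKey = page.LastEvaluatedKey;
  } while (page.LastEvaluatedKey);
  return { deviceIds, pagesRead };
}

// Usage (needs a real table):
// const { deviceIds } = await scanDevicesWithApp(docClient, "Devices", "qa-app");
```

Stopping as soon as Items is empty would silently miss matches further along in the table, which is the usual trap with filtered scans.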