I have a Lambda function that makes some requests to DynamoDB.
var ddb = new AWS.DynamoDB({apiVersion: '2012-08-10'});
const lookupminutes = 10;
var LookupDate = new Date(Date.now() - 60 * 1000 * lookupminutes); // lookupminutes ago, in milliseconds
params = {
TableName: TableName,
IndexName: "requestdate-index",
KeyConditionExpression: "requestdate > :startdate",
ExpressionAttributeValues: {":startdate": {S: LookupDate.toISOString()}
},
ProjectionExpression: "id, requestdate"
};
var results = await ddb.query(params).promise();
When I run the Lambda function, I get the error "Query key condition not supported" on the line that runs the query against DynamoDB.
The field requestdate is stored in the table as a string.
Does anyone know what I am doing wrong?
Thanks.
You cannot use anything other than an equals operator on a partition key:
params = {
TableName: TableName,
IndexName: "requestdate-index",
KeyConditionExpression: "requestdate = :startdate",
ExpressionAttributeValues: {":startdate": {S: LookupDate.toISOString()}},
ProjectionExpression: "id, requestdate"
};
If you need all of the data from the last 10 minutes, then you have two choices, neither of which is very scalable unless you shard your key (1a):
1. Put all the data in your index under the same partition key, with the timestamp as the sort key. Then use a KeyConditionExpression like:
gsipk = 1 AND timestamp > (now - 10 minutes)
As all of the items are under the same partition key, the query will be efficient, but at the cost of scalability, as you essentially cap your write throughput at 1000 WCU.
1a. Probably the best option if you need to scale beyond 1000 WCU is to do just as above, except use a random number within a range for the partition key. For example, range = 0-9. That gives us 10 unique partition keys, allowing us to scale to 10k WCU, but requires us to issue 10 Query requests in parallel to retrieve the data.
2. Use a Scan with a FilterExpression on the base table. If you do not want to place everything under the same key on the GSI, then you can just Scan and add a filter. This becomes slow and expensive as the table grows.
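To make option 1a a bit more concrete, here is a minimal sketch of the fan-out. It assumes the index has a numeric partition key named gsipk holding the shard number and requestdate as the sort key; gsipk, SHARD_COUNT, pickShard, buildShardQueries and queryRecent are illustrative names, not part of the original table. runQuery is whatever function actually sends the Query.

```javascript
// Assumption: the index's partition key is a numeric attribute "gsipk"
// (the shard number) and its sort key is "requestdate" (ISO 8601 string).
const SHARD_COUNT = 10;

// Pick a random shard at write time so writes spread over SHARD_COUNT keys.
function pickShard() {
  return Math.floor(Math.random() * SHARD_COUNT);
}

// Build one Query input per shard for items newer than `since` (a Date).
function buildShardQueries(tableName, since) {
  const queries = [];
  for (let shard = 0; shard < SHARD_COUNT; shard++) {
    queries.push({
      TableName: tableName,
      IndexName: "requestdate-index",
      KeyConditionExpression: "gsipk = :pk AND requestdate > :start",
      ExpressionAttributeValues: {
        ":pk": { N: String(shard) },
        ":start": { S: since.toISOString() },
      },
      ProjectionExpression: "id, requestdate",
    });
  }
  return queries;
}

// Run all shard queries in parallel and flatten the results.
// `runQuery` takes a Query input and resolves to an object with `Items`.
async function queryRecent(runQuery, tableName, since) {
  const pages = await Promise.all(
    buildShardQueries(tableName, since).map((q) => runQuery(q))
  );
  return pages.flatMap((page) => page.Items || []);
}
```

With the SDK used in the question, this could be called as queryRecent((p) => ddb.query(p).promise(), TableName, LookupDate). The merged items are not globally sorted, so sort them in your code if order matters.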
So I have this notifications table with the following columns:
PK: (which stores the userId)
sentAt: (which stores the date the notification was sent)
data: (which stores the data of the notification)
Read: (a boolean value which tells if the user has read the specific notification)
I wanted to create a GSI to get all the notifications from a specific user that are not read (Read: False)
So the partition key would be userId and the sort key would be Read but the issue here is that I cannot give a boolean value to the sort key to be able to query the users that have not read the notifications.
This works with scan but that is not the result I am trying to achieve. Can anyone help me on this? Thanks
const params ={
TableName: await this.configService.get('NOTIFICATION_TABLE'),
FilterExpression: '#PK = :PK AND #Read = :Read',
ExpressionAttributeNames: {
'#PK': 'PK',
'#Read': 'Read',
},
ExpressionAttributeValues: {
':PK': 'NOTIFICATION#a8a8e4c7-cab0-431e-8e08-1bcf962358b8',
':Read': true, // this is causing the error
},
};
const response = await this.dynamoDB.scan(params).promise();
Correct, a boolean value cannot be used as a DynamoDB partition key or sort key.
Some alternatives you could consider:
1. Create a GSI with only a partition key, gsi-userId. You can then query by userId and filter by Read. This at least saves some cost, as you no longer need to scan the whole table. However, be aware of hot partitions.
2. Consider changing the Read data type to a string instead, e.g. with only the values Y or N. You can then create a GSI, gsi-userId-Read, with the user ID as the partition key and Read as the sort key, and this would fulfill what you need.
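As a rough sketch of the second alternative, assuming Read has been converted to the strings Y/N and a GSI named gsi-userId-Read exists with PK as its partition key and Read as its sort key (toReadFlag and buildUnreadQuery are hypothetical helper names), the DocumentClient query input could look like this:

```javascript
// Convert the boolean used in application code to the string stored in DynamoDB.
function toReadFlag(isRead) {
  return isRead ? "Y" : "N";
}

// Build a DocumentClient Query input for one user's unread notifications.
// Assumption: GSI "gsi-userId-Read" has partition key "PK" and sort key "Read".
function buildUnreadQuery(tableName, userPk) {
  return {
    TableName: tableName,
    IndexName: "gsi-userId-Read",
    KeyConditionExpression: "#PK = :PK AND #Read = :Read",
    ExpressionAttributeNames: { "#PK": "PK", "#Read": "Read" },
    ExpressionAttributeValues: { ":PK": userPk, ":Read": toReadFlag(false) },
  };
}
```

This would replace the scan in the question with something like this.dynamoDB.query(buildUnreadQuery(tableName, pk)).promise(). Both conditions now sit in KeyConditionExpression, so only matching items are read.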
I have a table that will contain < 1300 entries at about 600 bytes each. The goal is to display pages of results ordered by epoch date. Right now, for any given search I request the full list of ids using a filtered scan, then handle paging on the UI side. For each page, I pass a chunk of ids to retrieve the full entry (also currently a filtered scan). Ideally, the list of ids would return sorted, but if I understand the docs correctly, only results that have the same partition key are sorted. My current partition key is a uuid, so all entries are unique.
Do I essentially need to use a throwaway key for the partition just to get results returned by date? Maybe the size of my table makes this unreasonable to begin with? Is there a better way to handle this? I have another field, "is_active" that's currently a boolean and could be used for the partition key if I converted it to numeric, but that might complicate my update method. 95% of the time, every entry in the db will be "active", so this doesn't seem efficient.
Scan Index
let params = {
TableName: this.TABLE_NAME,
IndexName: this.INDEX_NAME,
ScanIndexForward: false,
ProjectionExpression: "id",
FilterExpression: filterSqlStatement,
ExpressionAttributeValues: filterValues,
ExpressionAttributeNames: {
"#n": "name"
}
};
let results = await this.DDB_CLIENT.scan(params).promise();
let finalizedResults = results ? results.Items : [];
Given that your dataset is relatively small you might try a fixed partition key with a sort key of the date and the UUID. You'd query by the partition key (which would be a fixed value) and the results would come back sorted. This isn't the best idea with large data sets, but < 1300 is not large.
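A minimal sketch of that layout, assuming DocumentClient-style values. The attribute names listPk and listSk, the fixed value ENTRY, and the epoch field on each entry are all made up for illustration. The epoch is zero-padded so that string ordering of the sort key matches numeric ordering.

```javascript
// Hypothetical names: "listPk" is the fixed partition key and "listSk" is a
// sort key made of the zero-padded epoch date plus the UUID.
const FIXED_PK = "ENTRY";

// Zero-pad the epoch so lexicographic order equals chronological order.
function buildSortKey(epochSeconds, id) {
  return String(epochSeconds).padStart(12, "0") + "#" + id;
}

// Add the list keys to an entry before writing it (assumes entry.epoch, entry.id).
function withListKeys(entry) {
  return {
    ...entry,
    listPk: FIXED_PK,
    listSk: buildSortKey(entry.epoch, entry.id),
  };
}

// Build a Query input for one page of results, newest first.
function buildPageQuery(tableName, pageSize, lastKey) {
  const params = {
    TableName: tableName,
    KeyConditionExpression: "listPk = :pk",
    ExpressionAttributeValues: { ":pk": FIXED_PK },
    ScanIndexForward: false, // descending sort key order
    Limit: pageSize,
  };
  if (lastKey) params.ExclusiveStartKey = lastKey;
  return params;
}
```

If these keys live on a GSI rather than the base table, add IndexName to the params. Passing the previous response's LastEvaluatedKey as lastKey fetches the next page, which could replace the client-side chunking of ids.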
I have a DynamoDB table called 'frank' with a single GSI. The partition key is called PK, the sort key is called SK, the GSI partition key is called GSI1_PK and the GSI sort key is called GSI1_SK. I have a single 'data' map storing the actual data.
Populated with some test data it looks like this:
The GSI partition key and sort key map directly to the attributes with the same names within the table.
I can run a partiql query to grab the results that are shown in the image. Here's the partiql code:
select PK, SK, GSI1_PK, GSI1_SK, data from "frank"."GSI1"
where
("GSI1_PK"='tesla')
and
(
( "GSI1_SK" >= 'A_VISITOR#2021-06-01-00-00-00-000' and "GSI1_SK" <= 'A_VISITOR#2021-06-20-23-59-59-999' )
or
( "GSI1_SK" >= 'B_INTERACTION#2021-06-01-00-00-00-000' and "GSI1_SK" <= 'B_INTERACTION#2021-06-20-23-59-59-999' )
)
Note how the partiql code references "GSI1_SK" multiple times. The partiql query works, and returns the data shown in the image. All great so far.
However, I now want to move this into a Lambda function. How do I structure an AWS.DynamoDB.DocumentClient query to do exactly what this partiql query is doing?
I can get this to work in my Lambda function:
const visitorStart="A_VISITOR#2021-06-01-00-00-00-000";
const visitorEnd="A_VISITOR#2021-06-20-23-59-59-999";
var params = {
TableName: "frank",
IndexName: "GSI1",
KeyConditionExpression: "#GSI1_PK=:tmn AND #GSI1_SK BETWEEN :visitorStart AND :visitorEnd",
ExpressionAttributeNames :{ "#GSI1_PK":"GSI1_PK", "#GSI1_SK":"GSI1_SK" },
ExpressionAttributeValues: {
":tmn": lowerCaseTeamName,
":visitorStart": visitorStart,
":visitorEnd": visitorEnd
}
};
const data = await documentClient.query(params).promise();
console.log(data);
But as soon as I try a more complex compound condition I get this error:
ValidationException: Invalid operator used in KeyConditionExpression: OR
Here is the more complex attempt:
const visitorStart="A_VISITOR#2021-06-01-00-00-00-000";
const visitorEnd="A_VISITOR#2021-06-20-23-59-59-999";
const interactionStart="B_INTERACTION#2021-06-01-00-00-00-000";
const interactionEnd="B_INTERACTION#2021-06-20-23-59-59-999";
var params = {
TableName: "frank",
IndexName: "GSI1",
KeyConditionExpression: "#GSI1_PK=:tmn AND (#GSI1_SK BETWEEN :visitorStart AND :visitorEnd OR #GSI1_SK BETWEEN :interactionStart AND :interactionEnd) ",
ExpressionAttributeNames :{ "#GSI1_PK":"GSI1_PK", "#GSI1_SK":"GSI1_SK" },
ExpressionAttributeValues: {
":tmn": lowerCaseTeamName,
":visitorStart": visitorStart,
":visitorEnd": visitorEnd,
":interactionStart": interactionStart,
":interactionEnd": interactionEnd
}
};
const data = await documentClient.query(params).promise();
console.log(data);
The docs say that KeyConditionExpressions don't support 'OR'. So, how do I replicate my more complex partiql query in Lambda using AWS.DynamoDB.DocumentClient?
If you look at the documentation of PartiQL for DynamoDB, it warns you that PartiQL will happily fall back to a full table scan to get your data: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/ql-reference.select.html#ql-reference.select.syntax
To ensure that a SELECT statement does not result in a full table scan, the WHERE clause condition must specify a partition key. Use the equality or IN operator.
When the WHERE clause does not do that, PartiQL runs a scan and uses a FilterExpression to filter out the data.
Of course, in your example you provided a partition key, so I'd assume that PartiQL runs a query with the partition key and a FilterExpression to apply the rest of the condition.
You could replicate it that way, and depending on the size of your partitions this might work just fine. However, if the partition grows beyond 1 MB and most of the data is filtered out, you'll need to handle pagination even for pages that return no items.
Because of that, I'd suggest you simply split it up, run each OR branch as a separate query, and merge the data on the client.
Unfortunately, DynamoDB does not support multiple boolean operations in the KeyConditionExpression. The partiql query you are executing is probably performing a full table scan to return the results.
If you want to replicate the partiql query using the DocumentClient, you could use the scan operation. If you want to avoid using scan, you could perform two separate query operations and join the results in your application code.
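As a sketch of the "separate queries, merged in application code" approach, reusing the table, index and attribute names from the question: buildRangeQuery and queryRanges are hypothetical helpers, and runQuery stands for whatever sends the request.

```javascript
// Build a DocumentClient Query input for one sort-key range on GSI1.
function buildRangeQuery(teamName, start, end) {
  return {
    TableName: "frank",
    IndexName: "GSI1",
    KeyConditionExpression: "#pk = :pk AND #sk BETWEEN :start AND :end",
    ExpressionAttributeNames: { "#pk": "GSI1_PK", "#sk": "GSI1_SK" },
    ExpressionAttributeValues: { ":pk": teamName, ":start": start, ":end": end },
  };
}

// Run one Query per [start, end] range in parallel and merge the items.
// `runQuery` takes a Query input and resolves to an object with `Items`.
async function queryRanges(runQuery, teamName, ranges) {
  const results = await Promise.all(
    ranges.map(([start, end]) => runQuery(buildRangeQuery(teamName, start, end)))
  );
  return results.flatMap((r) => r.Items || []);
}
```

In the Lambda this could be called as queryRanges((p) => documentClient.query(p).promise(), lowerCaseTeamName, [[visitorStart, visitorEnd], [interactionStart, interactionEnd]]). The two ranges do not overlap, so no de-duplication is needed. The sketch reads only the first page of each query, so a range holding more than 1 MB of data would also need a LastEvaluatedKey loop.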
I have a DynamoDB table where each Item has a key of the name 'DataType'.
Also there is a GSI on this table with this 'DataType' as the HashKey and 'timestamp' as rangeKey.
Around 10 per cent of the table items have the 'DataType' value as 'A'.
I want to scan all the items of this GSI with HashKey fixed as 'A'. Is there any way to perform this using scan/parallel scan? Or do I need to use query on the GSI itself to perform this operation?
As per the documentation,
https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/dynamodbv2/document/spec/ScanSpec.html
I could not find any way to specify a GSI on which I can scan with HashKey fixed.
Given that you want to only look at items with the Hash key "A", you'll need to use the Query API rather than the Scan API, provide the index name, and query for items in the index that have that partition key.
Here's some sample code using the Node AWS SDK V3 that creates three items in a table with a Global Secondary Index (called GSI1). Two of the items have a GSI1PK value of "orange", while the other has a GSI1PK value of "gold". The query returns the two matches:
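import { DynamoDBClient, PutItemCommand, QueryCommand, QueryCommandInput } from "@aws-sdk/client-dynamodb";
import { marshall, unmarshall } from "@aws-sdk/util-dynamodb";
// getFromEnv, log and wait are small helper functions that are not shown here.
const tableName = getFromEnv(`TABLE_NAME`);
const client = new DynamoDBClient({});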
async function createItem(
name: string,
PK: string,
SK: string,
GSI1PK: string,
GSI1SK: string,
): Promise<void> {
const item = { PK, SK, GSI1PK, GSI1SK, name };
const putCommand = new PutItemCommand({
TableName: tableName,
Item: marshall(item)
});
await client.send(putCommand);
log(`Created item: ${name} with GSI1PK ${GSI1PK}`);
}
await createItem(`foo`, `fooPK`, `fooSK`, `orange`, `blue`);
await createItem(`bar`, `barPK`, `barSK`, `orange`, `white`);
await createItem(`baz`, `bazPK`, `bazSK`, `gold`, `garnet`);
log(`Waiting 5 seconds, as GSIs don't support consistent reads`)
await wait(5);
const query: QueryCommandInput = {
TableName: tableName,
IndexName: `GSI1`,
KeyConditionExpression: `#pk = :pk`,
ExpressionAttributeNames: {
'#pk': `GSI1PK`,
},
ExpressionAttributeValues: {
':pk': { S: `orange` },
},
}
const result = await client.send(new QueryCommand(query));
log(`Querying GSI1 for "orange"`);
result.Items.forEach((entry) => {
log(`Received: `, unmarshall(entry).name);
});
This produces the output of:
Created item: foo with GSI1PK orange
Created item: bar with GSI1PK orange
Created item: baz with GSI1PK gold
Waiting 5 seconds, as GSIs don't support consistent reads
Querying GSI1 for "orange"
Received: foo
Received: bar
One thing worth noting from this example is that GSIs don't support strongly consistent reads; reads from a GSI are always eventually consistent. So if your use case requires immediate consistency, you'll need to find another solution.
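Since a single Query call returns at most 1 MB of data, a partition key that covers around 10 per cent of the table will usually need several calls. A minimal sketch of the pagination loop follows; queryAllPages is a hypothetical helper and runQuery is whatever function sends the request.

```javascript
// Keep calling Query until DynamoDB stops returning a LastEvaluatedKey.
// `runQuery` takes a Query input and resolves to { Items, LastEvaluatedKey }.
async function queryAllPages(runQuery, baseParams) {
  const items = [];
  let lastKey;
  do {
    // Resume from where the previous page stopped, if there was one.
    const params = lastKey
      ? { ...baseParams, ExclusiveStartKey: lastKey }
      : baseParams;
    const page = await runQuery(params);
    items.push(...(page.Items || []));
    lastKey = page.LastEvaluatedKey;
  } while (lastKey);
  return items;
}
```

With the SDK v3 client from the sample above, this could be called as queryAllPages((p) => client.send(new QueryCommand(p)), query). It collects everything in memory, so for very large result sets you may prefer to process each page as it arrives.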
I have a DynamoDB table that contains the following keys:
id (value is a uuid) - this is the primary key
some_other_field - is just a regular key
I'd like to be able to query DynamoDB to get the items where some_other_field equals some value.
In order to do that, does some_other_field need to be a sort key?
Can I instead store this as a Document item, instead of a key-value item? I've found no documentation on how to do so, though.
I guess you have a DynamoDB table (not item) with the keys:
id - string - call it Partition Key or Hash Key
some_other_field - string|number|blob - call it Sort Key or Range Key, or a regular attribute if it is not part of the key
Whatever your case is, I would define a Global Secondary Index with the Partition Key: some_other_field and Projection: KEYS_ONLY.
You can query the index for the items with some_other_field = VALUE. Thus, you never scan the whole table; you only get what you need.
// There may be some small errors in names, consider that code a hint ;)
const params = {
TableName: 'MY_TABLE_NAME',
IndexName: 'MY_INDEX_NAME',
KeyConditionExpression: '#pk = :pk',
ExpressionAttributeNames: {
'#pk': 'some_other_field', // GSI partition key
},
ExpressionAttributeValues: {
':pk': MY_VALUE,
},
}
This is not the only solution. You can also scan the table with a filter expression to keep the items that match the condition, but that is more expensive than the solution above because it always scans the whole table.
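For completeness, here is a rough sketch of an UpdateTable request that would add such an index to an existing table. It assumes some_other_field is a string and that the table uses on-demand billing (a provisioned table would also need ProvisionedThroughput inside Create). buildCreateIndexRequest is a hypothetical helper name.

```javascript
// Build an UpdateTable input that adds a KEYS_ONLY GSI on one attribute.
// Assumptions: the attribute is a string ("S") and the table is on-demand.
function buildCreateIndexRequest(tableName, indexName, attributeName) {
  return {
    TableName: tableName,
    // Any attribute used in an index key schema must be declared here.
    AttributeDefinitions: [
      { AttributeName: attributeName, AttributeType: "S" },
    ],
    GlobalSecondaryIndexUpdates: [
      {
        Create: {
          IndexName: indexName,
          KeySchema: [{ AttributeName: attributeName, KeyType: "HASH" }],
          Projection: { ProjectionType: "KEYS_ONLY" },
        },
      },
    ],
  };
}
```

With the AWS SDK for JavaScript v2, the result could be passed to new AWS.DynamoDB().updateTable(request).promise(). Since the projection is KEYS_ONLY, a query on the index returns only id and some_other_field; fetch the full items from the base table by id if you need the other attributes.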