Error in KeyConditionExpression when using contains on partition key

I have Tags as partition key in my table, and when I am trying to query I am getting AttributeError.
Below is my code:
kb_table = boto3.resource('dynamodb').Table('table_name')
result = kb_table.query(
KeyConditionExpression=Key('Tags').contains('search term')
)
return result['Items']
Error:
"errorMessage": "'Key' object has no attribute 'contains'"
Basically, I want to search the table for items where the field contains that search term. I have achieved it using scan, but I have read everywhere that we should not use that.
result = kb_table.scan(
FilterExpression="contains (Tags, :titleVal)",
ExpressionAttributeValues={ ":titleVal": "search term" }
)
So I have changed my partition-key to Tags along with a sort-key so that I can achieve this using query but now I am getting this error.
Any idea how to get this working?

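To use Query you must specify exactly one partition key value; you cannot wildcard the partition key or match several values. contains is not a valid key condition, which is why boto3's Key object has no such method. From the documentation: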
KeyConditionExpression
The condition that specifies the key value(s)
for items to be retrieved by the Query action.
The condition must perform an equality test on a single partition key
value.
Assuming you want to search the whole table for tags, a scan is the most appropriate approach.
EDIT: You can use Query with the exact search term, but I'm guessing that is not what you want.
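import boto3
from boto3.dynamodb.conditions import Key

kb_table = boto3.resource('dynamodb').Table('table_name')
# The partition key condition must be an exact equality match.
result = kb_table.query(
    KeyConditionExpression=Key('Tags').eq('search term')
)
return result['Items']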
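If a substring search over the whole table really is what you need, a paginated Scan is the usual shape. Here is a minimal sketch, assuming table is a boto3 Table resource such as kb_table above. The function name and the #t / :term placeholders are made up for illustration. A single Scan call returns at most 1 MB of data, so the loop follows LastEvaluatedKey until the table is exhausted.

```python
def scan_tags_containing(table, term):
    """Return every item whose Tags attribute contains `term`.

    `table` is assumed to be a boto3 DynamoDB Table resource (or any
    object with a compatible scan method).
    """
    kwargs = {
        "FilterExpression": "contains(#t, :term)",
        "ExpressionAttributeNames": {"#t": "Tags"},
        "ExpressionAttributeValues": {":term": term},
    }
    items = []
    while True:
        page = table.scan(**kwargs)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            return items
        # Resume the scan where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key
```

This still reads every item in the table; the filter only reduces what is sent back.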

Related

DynamoDB Query with filter

I would like to write a DynamoDB query that filters on a certain field, which sounds simple.
All the examples I find always include the partition key value, which really confuses me, since it is a unique value, but I want a list.
I have id as the partition key and no sort key or any other index. I tried adding partner as an index, but it did not make any difference.
AttributeValue attribute = AttributeValue.builder()
.s(partner)
.build();
Map<String, AttributeValue> expressionValues = new HashMap<>();
expressionValues.put(":value", attribute);
Expression expression = Expression.builder()
.expression("partner = :value")
.expressionValues(expressionValues)
.build();
QueryConditional queryConditional = QueryConditional
.keyEqualTo(Key.builder()
.partitionValue("id????")
.build());
Iterator<Product> results = productTable.query(r -> r.queryConditional(queryConditional)).items().iterator();
Would appreciate any help. Is there a misunderstanding on my side?
DynamoDB has two distinct, but similar, operations - Query and Scan:
Scan is for reading the entire table, including all partition keys.
Query is for reading a specific partition key and all sort keys in it (or a contiguous range of sort keys, hence the nickname "range key" for that key).
If your data model does not have a range key, Query can return at most one item per partition key value, so it is not relevant for you; you should use Scan.
However, this means that each time you run this Scan, the entire table will be read. Unless your table is tiny, this doesn't make economic sense, and you should reconsider your data model. For example, if you frequently look up results by the "partner" attribute, you can consider creating a GSI (global secondary index) with "partner" as its partition key, allowing you to quickly and cheaply fetch the list of items with a given "partner" value without scanning the entire table.
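To illustrate the GSI suggestion, here is a rough Python (boto3) sketch of querying such an index. It assumes a GSI already exists with "partner" as its partition key. The index name "partner-index" and the function name are hypothetical, and table is assumed to be a boto3 Table resource.

```python
def query_by_partner(table, partner, index_name="partner-index"):
    """Collect all items for one partner value from a GSI.

    `index_name` is a hypothetical name; use whatever the GSI is
    actually called on the table.
    """
    kwargs = {
        "IndexName": index_name,
        "KeyConditionExpression": "#p = :p",
        "ExpressionAttributeNames": {"#p": "partner"},
        "ExpressionAttributeValues": {":p": partner},
    }
    items = []
    while True:
        page = table.query(**kwargs)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            return items
        # More results remain for this partner; fetch the next page.
        kwargs["ExclusiveStartKey"] = last_key
```

Unlike the table's primary key, a GSI key does not have to be unique, so one partner value can return many items.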

Querying a Global Secondary Index of a DynamoDB table without using the partition key

I have a DynamoDB table with partition key as userID and no sort key.
The table also has a timestamp attribute in each item. I wanted to retrieve all the items having a timestamp in the specified range (regardless of userID i.e. ranging across all partitions).
After reading the docs and searching Stack Overflow (here), I found that I need to create a GSI for my table.
Hence, I created a GSI with the following keys:
Partition Key: userID
Sort Key: timestamp
I am querying the index with Java SDK using the following code:
String lastWeekDateString = getLastWeekDateString();
AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard().build();
DynamoDB dynamoDB = new DynamoDB(client);
Table table = dynamoDB.getTable("user table");
Index index = table.getIndex("userID-timestamp-index");
QuerySpec querySpec = new QuerySpec()
.withKeyConditionExpression("timestamp > :v_timestampLowerBound")
.withValueMap(new ValueMap()
.withString(":v_timestampLowerBound", lastWeekDateString));
ItemCollection<QueryOutcome> items = index.query(querySpec);
Iterator<Item> iter = items.iterator();
while (iter.hasNext()) {
Item item = iter.next();
// extract item attributes here
}
I am getting the following error on executing this code:
Query condition missed key schema element: userID
From what I know, I should be able to query the GSI using only the sort key without giving any condition on the partition key. Please help me understand what is wrong with my implementation. Thanks.
Edit: After reading the thread here, it turns out that we cannot query a GSI with only a range on the sort key. So, what is the alternative, if any, to query the entire table by a range query on an attribute? One suggestion I found in that thread was to use year as the partition key. This will require multiple queries if the desired range spans multiple years. Also, this does not distribute the data uniformly across all partitions, since only the partition corresponding to the current year will be used for insertions for one full year. Please suggest any alternatives.
When using the DynamoDB Query operation, you must specify at least the partition key. This is why you get the error that userID is required. From the AWS Query docs:
The condition must perform an equality test on a single partition key value.
The only way to get items without specifying the partition key is a Scan operation (but the results won't be sorted by your sort key!).
If you want to get all the items sorted, you would have to create a GSI with a partition key that will be the same for all items you need (e.g. create a new attribute on all items, such as "type": "item"). You can then query the GSI and specify #type=:item
QuerySpec querySpec = new QuerySpec()
    // "type" and "timestamp" are DynamoDB reserved words, so both need #-placeholders
    .withKeyConditionExpression("#type = :item AND #ts > :v_timestampLowerBound")
    .withNameMap(new NameMap()
        .with("#type", "type")
        .with("#ts", "timestamp"))
    .withValueMap(new ValueMap()
        .withString(":v_timestampLowerBound", lastWeekDateString)
        .withString(":item", "item"));
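For comparison, the same request could look roughly like this in Python (boto3). This is only a sketch of the request parameters. The index name "type-timestamp-index" and the function name are assumptions, and every item is assumed to carry the constant attribute type = "item".

```python
def recent_items_query(lower_bound, index_name="type-timestamp-index"):
    """Build Query parameters for a GSI keyed on a constant `type`
    attribute with `timestamp` as the sort key.

    `index_name` is hypothetical. `type` and `timestamp` are DynamoDB
    reserved words, so both go through ExpressionAttributeNames.
    """
    return {
        "IndexName": index_name,
        "KeyConditionExpression": "#type = :item AND #ts > :lower",
        "ExpressionAttributeNames": {"#type": "type", "#ts": "timestamp"},
        "ExpressionAttributeValues": {":item": "item", ":lower": lower_bound},
        # Return results in ascending sort key (timestamp) order.
        "ScanIndexForward": True,
    }
```

With a boto3 Table resource this would be used as table.query(**recent_items_query(last_week_date_string)).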
A good solution for any custom querying requirement in DynamoDB is always to design the right primary key schema for the GSI.
When designing a DynamoDB primary key, the main principle is that the hash key should partition the whole set of items, and the sort key should order the items within a partition.
Given that, I recommend using the year of the timestamp as the hash key and the month-date as the sort key.
In this case you need at most two queries, as long as the range spans no more than two calendar years.
You are right that you should avoid filtering or scanning as much as you can.
For example, if the start date and the end date fall in the same year, you need only one query. A key condition allows only one condition on the sort key, so the range has to be written with BETWEEN (which is inclusive), and placeholder names cannot contain hyphens:
.withKeyConditionExpression("#year = :year and #monthDate between :startMonthDate and :endMonthDate")
Otherwise, run one query for the start year:
.withKeyConditionExpression("#year = :startYear and #monthDate > :startMonthDate")
and one for the end year:
.withKeyConditionExpression("#year = :endYear and #monthDate < :endMonthDate")
Finally, union the result sets from both queries.
This takes at most two Query requests; the read capacity consumed depends on how much data each query reads, not on the number of requests alone.
To make sort key comparisons easier, you might want to store a UNIX timestamp.
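The per-year splitting described above can be sketched in Python as follows. This is only an illustration: the function name is made up, the sort key is assumed to be a zero-padded "MM-DD" string, and the bounds are treated as inclusive (as with BETWEEN) rather than strict.

```python
from datetime import date


def year_partition_ranges(start, end):
    """Split [start, end] into one (year, low, high) tuple per calendar
    year, where low/high are inclusive "MM-DD" sort key bounds.

    Each tuple corresponds to one Query against the year partition.
    """
    if start > end:
        raise ValueError("start must not be after end")
    ranges = []
    for year in range(start.year, end.year + 1):
        # Only the first and last years are clipped; years in between
        # are covered in full.
        low = start.strftime("%m-%d") if year == start.year else "01-01"
        high = end.strftime("%m-%d") if year == end.year else "12-31"
        ranges.append((year, low, high))
    return ranges
```

A range inside a single year yields one tuple, a range crossing one New Year yields two, and longer ranges yield one per year touched.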
Thanks

Scan operation with FilterExpression having multiple conditions with "and" operator

I am writing a lambda function in Go and using DynamoDB as my database.
I need to write a scan operation with multiple conditions (e.g. field1 = value1 and field2 = value2 and field3 = value3).
I am creating a FilterExpression string based on how many parameters/conditions are supplied by the user.
My filter expression is as below:
(#field1 = :field1Val) and (#field2 = :field2Val)
I am also providing the ExpressionAttributeNames and the ExpressionAttributeValues in the maps to the scan operation input. However, I am not getting any results (count = 0).
If I specify only one condition or if I use "or" operator instead of "and" operator, I get the results.
Looks like the second condition (#field2 = :field2Val), even if I use any field ( field3, field4, etc.) is always resulting in "false".
Any pointers?
Where do I see the logs of this query/scan operation?
I found the problem.
The filter condition string is correct:
(#field1 = :field1Val) and (#field2 = :field2Val)
I was iterating in a loop to find out which search parameters the user had specified.
There was a mistake in that code: I was using the same variable for all the attribute names.
attributeName := "field1"
attributeNamemap["#field1"] = &attributeName
This "attributeName" variable was reused for every search parameter, so every map entry held a pointer to the same variable.
That was causing the problem; once I used a separate variable per parameter, it started working.
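For reference, building such an "and" filter from a variable number of conditions can be sketched like this in Python. This is an illustration rather than the original Go code: the function name and the #f0 / :v0 placeholder scheme are assumptions. Each condition gets its own placeholder pair, which avoids the shared-variable mistake described above.

```python
def build_and_filter(conditions):
    """Build Scan parameters that AND together equality tests.

    `conditions` maps attribute names to the values they must equal.
    Every condition gets its own name and value placeholder.
    """
    names = {}
    values = {}
    clauses = []
    for i, (field, value) in enumerate(conditions.items()):
        name_key = f"#f{i}"
        value_key = f":v{i}"
        names[name_key] = field
        values[value_key] = value
        clauses.append(f"({name_key} = {value_key})")
    return {
        "FilterExpression": " and ".join(clauses),
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
    }
```

With a boto3 Table resource the result can be passed straight through, for example table.scan(**build_and_filter({"field1": "a", "field2": "b"})).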

dynamodb - scan items where map contains a key

I have a table that contains a field (not a key field), called appsMap, and it looks like this:
appsMap = { "qa-app": "abc", "another-app": "xyz" }
I want to scan all rows whose appsMap contains the key "qa-app" (the value is not important, just the key). I tried something like this but it doesn't work in the way I need:
FilterExpression = '#appsMap.#app <> :v',
ExpressionAttributeNames = {
"#app": "qa-app",
"#appsMap": "appsMap"
},
ExpressionAttributeValues = {
":v": { "NULL": True }
},
ProjectionExpression = "deviceID"
What's the correct syntax?
Thanks.
There is a discussion on the subject here:
https://forums.aws.amazon.com/thread.jspa?threadID=164470
You might be missing this part from the example:
ExpressionAttributeValues: {":name":{"S":"Jeff"}}
However, I just want to echo what has already been said: Scan is an expensive operation that goes through every item, which makes your database hard to scale.
Unlike with other databases, you have to do plenty of setup with DynamoDB to get it to perform at its best. Here is a suggestion:
1) Promote this to a top-level attribute, for example add qaExist to the item root, with possible values of 0|1 or "true"|"false" (stored as a number or string, since a boolean cannot be an index key).
2) Create a secondary index on the newly created attribute.
3) Query the new index, specifying the flag value you are looking for (1 if that marks items containing the key) as the search parameter.
This will make your system very fast and very scalable regardless of how many records you get in there later on.
If I understand the question correctly, you can do the following:
FilterExpression = 'attribute_exists(#0.#1)',
ExpressionAttributeNames = {
"#0": "appsMap",
"#1": "qa-app"
},
ProjectionExpression = "deviceID"
Since you're being a bit vague about your expectations and what's happening ("I tried something like this but it doesn't work in the way I need"), I'd like to mention that a scan with a filter is very different from a query.
Filters are applied on the server, but only after the scan request has read the data. The scan still iterates over all the data in your table, and instead of returning each item it applies the filter to each page of results. That saves you some network bandwidth, but it can return empty pages as you page through your entire table.
You could look into creating a GSI on the table if this is a query you expect to have to run often.
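To make the paging behaviour concrete, here is a small Python (boto3) sketch of the attribute_exists scan. It assumes table is a boto3 Table resource; the function name and the #m / #k placeholders are made up for illustration. The page counter is only there to show that a page can come back empty while more pages remain.

```python
def scan_device_ids_with_app(table, app_key):
    """Return (device_ids, pages_read) for items whose appsMap has
    the key `app_key`.

    The map key goes through ExpressionAttributeNames because names
    like "qa-app" contain characters not allowed in a raw path.
    """
    kwargs = {
        "FilterExpression": "attribute_exists(#m.#k)",
        "ExpressionAttributeNames": {"#m": "appsMap", "#k": app_key},
        "ProjectionExpression": "deviceID",
    }
    device_ids = []
    pages_read = 0
    while True:
        page = table.scan(**kwargs)
        pages_read += 1
        # A page may hold no items at all: the filter is applied after
        # the items for that page have already been read.
        device_ids.extend(item["deviceID"] for item in page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            return device_ids, pages_read
        kwargs["ExclusiveStartKey"] = last_key
```

Stopping at the first empty page would be a bug here; only a missing LastEvaluatedKey means the scan is finished.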

How to format the id column with SHA1 digests in Rails application?

Without saving the SHA1 digest string in the table directly, is it possible to format the column in the select statement?
For example (I hope you know what I mean):
@item = Item.where(Digest::SHA1.hexdigest id.to_s:'356a192b7913b04c54574d18c28d46e6395428ab')
No, not the way you want it. The hexdigest method you're using won't be available at the database level. You could use database-specific functions though.
For example:
Item.where("LOWER(name) = ?", entered_name.downcase)
The LOWER() function will be available to the database so it can pass the name column to it.
For your case, I can suggest two solutions:
1) The obvious one: store the digest in the table and match against it (SHA1 is a hash rather than encryption, so the column name below is only illustrative).
key = '356a192b7913b04c54574d18c28d46e6395428ab'
Item.where(encrypted_id: key)
2) Iterate over all column values (ID, in your case) and find the one that matches:
all_item_ids = Item.pluck(:id)
item_id = all_item_ids.find { |id| Digest::SHA1.hexdigest(id.to_s) == key }
Then you could use Item.find(item_id) to get the item or Item.where(id: item_id) to get an ActiveRecord::Relation object.