DynamoDB equivalent of SQL's key or attribute 'not in("value1", "value2"...)'

I am trying to run some DynamoDB operations with AWS.DynamoDB.DocumentClient from the aws-sdk module, but I am unable to find an easy way to select items where an attribute is not equal to any value in an array.
e.g.
attribute <> ["value1", "value2"]
This is equivalent to a typical SQL operation of the form:
select * from sometable where attribute not in("value1", "value2"...);
After trying out different ScanFilter and QueryFilter options following the documentation here, it seems that the AttributeValueList for NE and NOT_CONTAINS does not accept multiple values.
I would like to get this result without having to define multiple 'AND' conditions.
I have since arrived at the solution below, but it seems clumsy, and because the filter condition is dynamic I would have to write logic to build the filter condition string and the ExpressionAttributeValues.
FilterExpression: 'answer <> :answer1 AND answer <> :answer2',
ExpressionAttributeValues : {
':answer1' : "test1",
':answer2' : "test2"
}
I therefore have two questions:
1) Is there a better way of doing this?
2) Is there a length limit to the string of KeyConditionExpression? I am very sure there is, but I cannot seem to find any information about it.

There is no other way to achieve what you need. If all these values have something in common and you know it at write time, you can insert them with some kind of prefix and create a GSI where they will be the sort key. In that case you'll be able to query them by prefix in the key condition expression. Otherwise, what you've suggested is your only option.
The limit is 4 KB per expression string. As described under Expression Parameters:
Expression parameters include ProjectionExpression, ConditionExpression, UpdateExpression, and FilterExpression.
The maximum length of any expression string is 4 KB. For example, the size of the ConditionExpression a=b is 3 bytes.
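Since the question mentions having to generate the filter string and ExpressionAttributeValues dynamically, here is a minimal Python sketch of that logic. The function name build_not_in_filter and the placeholder names (#attr, :v1, :v2, ...) are made up for illustration, and the size check assumes that "4 KB" means 4096 bytes.

```python
def build_not_in_filter(attribute, values):
    """Build scan/query parameters that exclude items whose attribute
    equals any of the given values (a chain of '<>' tests joined by AND)."""
    if not values:
        raise ValueError("values must not be empty")

    # One value placeholder per excluded value: :v1, :v2, ...
    placeholders = {f":v{i}": v for i, v in enumerate(values, start=1)}
    expression = " AND ".join(f"#attr <> {name}" for name in placeholders)

    # Assumption: the documented 4 KB limit is 4096 bytes of UTF-8 text.
    if len(expression.encode("utf-8")) > 4096:
        raise ValueError("filter expression exceeds the 4 KB limit")

    return {
        "FilterExpression": expression,
        "ExpressionAttributeNames": {"#attr": attribute},
        "ExpressionAttributeValues": placeholders,
    }


params = build_not_in_filter("answer", ["test1", "test2"])
print(params["FilterExpression"])  # #attr <> :v1 AND #attr <> :v2
```

The returned dictionary uses plain (untyped) values, so it should be usable with clients that marshal values for you, such as the DocumentClient from the question.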

Related

Scan operation with FilterExpression having multiple conditions with "and" operator

I am writing a lambda function in Go and using DynamoDB as my database.
I need to write a scan operation with multiple conditions (e.g. field1 = value1 and field2 = value2 and field3 = value3).
I am creating a FilterExpression string based on how many parameters/conditions are supplied by the user.
My filter expression is as below:
(#field1 = :field1Val) and (#field2 = :field2Val)
I am also providing the ExpressionAttributeNames and the ExpressionAttributeValues in the maps to the scan operation input. However, I am not getting any results (count = 0).
If I specify only one condition or if I use "or" operator instead of "and" operator, I get the results.
Looks like the second condition (#field2 = :field2Val), even if I use any field ( field3, field4, etc.) is always resulting in "false".
Any pointers?
Where do I see the logs of this query/scan operation?

I got the problem.
The filter condition string is correct -
(#field1 = :field1Val) and (#field2 = :field2Val)
I was iterating in a loop to find out which search parameters were specified by the user.
There was a mistake in the code: I was using the same variable for all the attribute names.
attributeName := "field1"
attributeNamemap["#field1"] = &attributeName
This "attributeName" variable was reused for every search parameter. Because the map stores a pointer to it, every placeholder ended up referring to the same string.
That was causing the problem. I used a different variable for each name and it started working.

Google Data Studio Calculated Field by Extracting String from Event Label Values

I'm trying to use the CASE statement to output string values for an Event Label field using RegEx to produce a table that shows the number of events for each field value. So, if I'm looking for foobar, and other string values separately, within values for Event Label; it may either stand alone or be part of a URL like so:
|[object HTMLLabelElement] | Foobar |
/images/foobar-26.svg
It seems REGEXP_EXTRACT might suit this the best:
CASE WHEN REGEXP_EXTRACT(Event Label, '.(?i)foobar.') THEN Foobar
However, the table produced using the calculated field as the dimension only contains a blank row that seems to be the sum of the number of events.
What am I missing?

I think you need to use REGEXP_MATCH not REGEXP_EXTRACT, given your existing syntax, or to change the syntax to a straight REGEXP_EXTRACT without the CASE element.

Error in KeyConditionExpression when using contains on partition key

I have Tags as partition key in my table, and when I am trying to query I am getting AttributeError.
Below is my code:
kb_table = boto3.resource('dynamodb').Table('table_name')
result = kb_table.query(
KeyConditionExpression=Key('Tags').contains('search term')
)
return result['Items']
Error:
"errorMessage": "'Key' object has no attribute 'contains'"
Basically I want to search through the table for items where the field contains that search term. I have achieved this using scan, but I have read everywhere that we should not use it.
result = kb_table.scan(
FilterExpression="contains (Tags, :titleVal)",
ExpressionAttributeValues={ ":titleVal": "search term" }
)
So I have changed my partition-key to Tags along with a sort-key so that I can achieve this using query but now I am getting this error.
Any idea how to get this working?

In order to use Query you must specify one partition to access; you cannot wildcard a partition or specify multiple keys. From the documentation:
KeyConditionExpression: The condition that specifies the key value(s) for items to be retrieved by the Query action. The condition must perform an equality test on a single partition key value.
Assuming you want to search the whole table for tags, a scan is the most appropriate approach.
EDIT: You can use Query with the exact search term, but I'm guessing that is not what you want.
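import boto3
from boto3.dynamodb.conditions import Key

kb_table = boto3.resource('dynamodb').Table('table_name')
# Query needs an equality test on the partition key
result = kb_table.query(
    KeyConditionExpression=Key('Tags').eq('search term')
)
return result['Items']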

dynamodb - scan items where map contains a key

I have a table that contains a field (not a key field), called appsMap, and it looks like this:
appsMap = { "qa-app": "abc", "another-app": "xyz" }
I want to scan all rows whose appsMap contains the key "qa-app" (the value is not important, just the key). I tried something like this but it doesn't work in the way I need:
FilterExpression = '#appsMap.#app <> :v',
ExpressionAttributeNames = {
"#app": "qa-app",
"#appsMap": "appsMap"
},
ExpressionAttributeValues = {
":v": { "NULL": True }
},
ProjectionExpression = "deviceID"
What's the correct syntax?
Thanks.

There is a discussion on the subject here:
https://forums.aws.amazon.com/thread.jspa?threadID=164470
You might be missing this part from the example:
ExpressionAttributeValues: {":name":{"S":"Jeff"}}
However, I just wanted to echo what has already been said: scan is an expensive operation that goes through every item, which makes your database hard to scale.
Unlike with other databases, you have to do plenty of setup with Dynamo to get it to perform at its best. Here is a suggestion:
1) Convert this into a root value, for example add to the root: qaExist, with possible values of 0|1 or true|false.
2) Create secondary index for the newly created value.
3) Make query on the new index specifying 0 as a search parameter.
This will make your system very fast and very scalable regardless of how many records you get in there later on.

If I understand the question correctly, you can do the following:
FilterExpression = 'attribute_exists(#0.#1)',
ExpressionAttributeNames = {
"#0": "appsMap",
"#1": "qa-app"
},
ProjectionExpression = "deviceID"
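The name placeholders matter here because, as far as I know, a key such as "qa-app" contains a hyphen and cannot be written directly inside an expression. A small Python sketch of building these parameters for an arbitrary map attribute and key follows; the helper name map_key_exists_params and the placeholders #m and #k are made up for illustration.

```python
def map_key_exists_params(map_attribute, key, projection=None):
    """Build scan parameters that match items whose map attribute has the given key."""
    params = {
        # Both the map name and the nested key go through name placeholders,
        # so keys with characters such as '-' do not break the expression.
        "FilterExpression": "attribute_exists(#m.#k)",
        "ExpressionAttributeNames": {"#m": map_attribute, "#k": key},
    }
    if projection is not None:
        params["ProjectionExpression"] = projection
    return params


params = map_key_exists_params("appsMap", "qa-app", projection="deviceID")
print(params["FilterExpression"])  # attribute_exists(#m.#k)
```

The result can be passed as keyword arguments to a scan call, for example table.scan(**params) with a boto3 Table resource.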

Since you're being a bit vague about your expectations and what's happening ("I tried something like this but it doesn't work in the way I need"), I'd like to mention that a scan with a filter is very different from a query.
Filters are applied on the server, but only after the scan request is executed. The scan still iterates over all the data in your table, and instead of returning every item it applies the filter to each page of results. That saves you some network bandwidth, but it can return empty pages as you page through your entire table.
You could look into creating a GSI on the table if this is a query you expect to have to run often.
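Because a filtered scan can return empty pages, the caller has to keep requesting pages until the response no longer carries a LastEvaluatedKey. Here is a minimal Python sketch of that loop. It assumes table behaves like a boto3 Table resource (a scan method that takes keyword arguments and returns a dict), and the helper name scan_all is made up for illustration.

```python
def scan_all(table, **scan_kwargs):
    """Yield every item matching the scan, following LastEvaluatedKey across pages."""
    kwargs = dict(scan_kwargs)
    while True:
        page = table.scan(**kwargs)
        # A page may be empty even when later pages contain matches.
        yield from page.get("Items", [])
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            break  # no more pages
        kwargs["ExclusiveStartKey"] = last_key
```

Usage would look something like list(scan_all(table, FilterExpression=..., ExpressionAttributeNames=...)). Every page still consumes read capacity for all the items scanned, not only the ones that pass the filter.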

How to search multiple strings in a string?

In a new Power Query column, I want to check whether a string like "This is a test string" contains any of the items in a list of strings such as {"dog","string","bark"}.
I already tried Text.PositionOfAny("This is a test string",{"dog","string","bark"}), but the function only accepts single-character values:
Expression.Error: The value isn't a single-character string.
Any solution for this?

This is a case where you'll want to combine a few M library functions together.
You'll want to use Text.Contains many times against a list, which is a good case for List.Transform. List.AnyTrue will tell you if any string matched.
List.AnyTrue(List.Transform({"dog","string","bark"}, (substring) => Text.Contains("This is a test string", substring)))
If you wished that there was a Text.ContainsAny function, you can write it!
let
Text.ContainsAny = (string as text, list as list) as logical =>
List.AnyTrue(List.Transform(list, (substring) => Text.Contains(string, substring))),
Invoked = Text.ContainsAny("This is a test string", {"dog","string","bark"})
in
Invoked

Another simple solution is this:
List.ContainsAny(Text.SplitAny("This is a test string", " "), {"dog","string","bark"})
It transforms the text into a list because there we find a function that does what you need.

If it's a specific (static) list of matches, you'll want to add a custom column with an if then else statement in PQ. Then use a filter on that column to keep or remove the rows. AFAIK PQ doesn't support regex, so Alexey's solution won't work.
If you need the lookup to be dynamic, it gets more complicated, but it is doable. You essentially need to:
1) Have an ID column for the original row.
2) Duplicate the query so you have two queries, then work in the newly created query.
3) Split the text field into separate columns, usually by space.
4) Unpivot the newly created columns.
5) Get the list of intended names.
6) Use the List.Generate method to generate a list that shows 1 if there's a match and 0 if there isn't.
7) Sum the values of the list.
8) If sum > 0, mark that row as a match; I usually use the value 1 in a new column. Then filter the table to keep only rows with value 1 in the new column.
9) Group this table on ID. This gives the list of IDs that contain the match.
10) Use the merge feature to merge in the first table, ensuring you keep only rows that match the IDs. That should get you to where you want to be.

Thanks for giving me the lead. In my own case I needed to ensure two items exist in a string hence I replaced formula as:
List.AllTrue(List.Transform({"/","2017"},(substring) => Text.Contains("4/6/2017 13",substring)))
it returned true perfectly.

You can use a regex here with the logical OR (|) operator:
/dog|string|bark/.test("This is a test string") // returns true
