AWS Step Function get all Range keys with a single primary key in DynamoDB - amazon-web-services

I'm building AWS Step Function state machines. My goal is to read all Items from a DynamoDB table with a specific value for the Hash key (username) and any (wildcard) Sort keys (order_id).
Basically something that would be done from SQL with:
SELECT username,order_id FROM UserOrdersTable
WHERE username = 'daffyduck'
I'm using Step Functions to get configurations for AWS Glue jobs from a DynamoDB table and then spawn a Glue job via step function's Map for each dynamodb item (with parameters read from the database).
"Read Items from DynamoDB": {
"Type": "Task",
"Resource": "arn:aws:states:::dynamodb:getItem",
"Parameters": {
"TableName": "UserOrdersTable",
"Key": {
"username": {
"S": "daffyduck"
},
"order_id": {
"S": "*"
}
}
},
"ResultPath": "$",
"Next": "Invoke Glue jobs"
}
But I can't bring the state machine to read all order_id's for the user daffyduck in the step function task above. No output is displayed using the code above, apart from http stats.
Is there a wildcard for order_id ? Is there another way of getting all order_ids? The query customization seems to be rather limited inside step functions:
https://docs.amazonaws.cn/amazondynamodb/latest/APIReference/API_GetItem.html#API_GetItem_RequestSyntax
Basically I'm trying to accomplish what can be done from the command line like so:
$ aws dynamodb query \
--table-name UserOrdersTable \
--key-condition-expression "Username = :username" \
--expression-attribute-values '{
":username": { "S": "daffyduck" }
}'
Any ideas? Thanks

I don't think that is possible with Step functions Dynamodb Service yet.
currently supports get, put, delete & update Item, not query or scan.
For GetItem we need to pass entire KEY (Partition + Range Key)
For the primary key, you must provide all of the attributes. For
example, with a simple primary key, you only need to provide a value
for the partition key. For a composite primary key, you must provide
values for both the partition key and the sort key.
We need to write a Lambda function to query Dynamo and return a map and invoke the lambda function from step.

Related

How do we encrypt the value of a nested dictionary to store in DynamoDB using DynamoDb Encryption Client?

I have the following dictionary
plaintext_item = {
"website": "https://example.com",
"description": "This is a sample data",
"website_username": {
"testuser1": "password12",
"testuser2": "password13",
}
}
In the above dictionary I want to encrypt both the passwords but not their usernames and store it in dynamoDb.
what I tried?
This was my first approach but didn't work
actions = AttributeActions(
default_action=CryptoAction.ENCRYPT_AND_SIGN,
attribute_actions={
"website": CryptoAction.DO_NOTHING,
plaintext_item["website_username"]["testuser1"]: CryptoAction.ENCRYPT_AND_SIGN,
"description": CryptoAction.DO_NOTHING,
}
)
Then I tried this below 2nd approach like how we update nested value in dynamodb, this too didn't work
actions = AttributeActions(
default_action=CryptoAction.ENCRYPT_AND_SIGN,
attribute_actions={
"website": CryptoAction.DO_NOTHING,
"website_username.testuser1": CryptoAction.ENCRYPT_AND_SIGN,
"description": CryptoAction.DO_NOTHING,
})
In both the above cases the whole object is getting encrypted and stored, I looked for some documentation but I am not able to find anything related, I am able to encrypt normal dictionaries like {"a":2,"b":3} but not nested ones.

Enforcing AWS Glue table properties order

I'm using boto3 to update a glue table's table parameters.
I'm doing this with the method update_table.
However, whatever order I push the TableInput parameter in, it gets ignored.
For example I push the following dict:
...
"Parameters": {
"AVTOR": "",
"PROJEKT": "",
"PODROCJE": "",
"CrawlerSchemaDeserializerVersion": "1.0",
"CrawlerSchemaSerializerVersion": "1.0",
...
After updating the table, the table properties are in a random order:
Is there a way to enforce the table properties order with update_table?

Is it possible to iterate through a DynamoDB table within a step function's map state?

Just what the title says, basically. I have read through the documentation:
https://docs.aws.amazon.com/step-functions/latest/dg/connect-ddb.html
This describes how to get a single item of information out of a DynamoDB table from a step function. What I would like to do is iterate through the entire table and start execution of another state machine for each item. Each new state machine would have an individual item as input. I have attempted the following code, which unfortunately is not functional:
{
"StartAt": "OuterFunction",
"States": {
"OuterFunction": {
"Type": "Map",
"Iterator": {
"StartAt": "InnerFunction",
"States": {
"InnerFunction": {
"Type": "Task",
"Resource": "arn:aws:states:::dynamodb:getItem.sync",
"Parameters": {
"StateMachineArn":"other-state-machine-arn",
"TableName": "TestTable"
},
"End": true
}
}
},
"End": true
}
}
}
Is it actually possible to iterate through a DynamoDB table in this way?
You are now able to call DynamoDB directly from step functions. This includes the query and scan operations. With the result, you can then iterate through the items. The one less convenient, caveat is that it does not use the document client, so the results are in the dynamodb json format.
https://docs.aws.amazon.com/step-functions/latest/dg/connect-ddb.html
No, getItem is designed to fetch particular DynamoDB document. You need to write custom Lambda that will .query() or .scan() your table and then use Map step to iterate over results (most likely you won't need getItem at that time, because you can load all data with the query/scan operation).

How to specify attributes to return from DynamoDB through AppSync

I have an AppSync pipeline resolver. The first function queries an ElasticSearch database for the DynamoDB keys. The second function queries DynamoDB using the provided keys. This was all working well until I ran into the 1 MB limit of AppSync. Since most of the data is in a few attributes/columns I don't need, I want to limit the results to just the attributes I need.
I tried adding AttributesToGet and ProjectionExpression (from here) but both gave errors like:
{
"data": {
"getItems": null
},
"errors": [
{
"path": [
"getItems"
],
"data": null,
"errorType": "MappingTemplate",
"errorInfo": null,
"locations": [
{
"line": 2,
"column": 3,
"sourceName": null
}
],
"message": "Unsupported element '$[tables][dev-table-name][projectionExpression]'."
}
]
}
My DynamoDB function request mapping template looks like (returns results as long as data is less than 1 MB):
#set($ids = [])
#foreach($pResult in ${ctx.prev.result})
#set($map = {})
$util.qr($map.put("id", $util.dynamodb.toString($pResult.id)))
$util.qr($map.put("ouId", $util.dynamodb.toString($pResult.ouId)))
$util.qr($ids.add($map))
#end
{
"version" : "2018-05-29",
"operation" : "BatchGetItem",
"tables" : {
"dev-table-name": {
"keys": $util.toJson($ids),
"consistentRead": false
}
}
}
I contacted the AWS people who confirmed that ProjectionExpression is not supported currently and that it will be a while before they will get to it.
Instead, I created a lambda to pull the data from DynamoDB.
To limit the results form DynamoDB I used $ctx.info.selectionSetList in AppSync to get the list of requested columns, then used the list to specify the data to pull from DynamoDB. I needed to get multiple results, maintaining order, so I used BatchGetItem, then merged the results with the original list of IDs using LINQ (which put the DynamoDB results back in the correct order since BatchGetItem in C# does not preserve sort order like the AppSync version does).
Because I was using C# with a number of libraries, the cold start time was a little long, so I used Lambda Layers pre-JITed to Linux which allowed us to get the cold start time down from ~1.8 seconds to ~1 second (when using 1024 GB of RAM for the Lambda).
AppSync doesn't support projection but you can explicitly define what fields to return in the response template instead of returning the entire result set.
{
"id": "$ctx.result.get('id')",
"name": "$ctx.result.get('name')",
...
}

AWS-Console: DynamoDB scan on nested field

I have below table in DynamoDB
{
"id": 1,
"user": {
"age": "26",
"email": "testuser#gmail.com",
"name": "test user"
}
}
Using AWS console, I want to scan all the records whose email address contains gmail.com
I am trying this but it is giving no results.
I am new to AWS, not sure what's wrong here. Is it not possible to scan on nested fields?
I've been trying to figure this out myself but it would seem that nested item scans are not supported through the console.
I'm going based off of this which offer some alternative options via CLI or SDK: https://forums.aws.amazon.com/thread.jspa?messageID=931016