I have one row in a table where an N (number) attribute has the value 1 rather than 0. The field is called is_active_duty_manager, and I want to pull back the row where the value is 1 so I can get the user credentials.
When I query the table using the following code:
var params = {
AttributesToGet: ['mobile'],
TableName: 've-users',
Key: { 'is_active_duty_manager': {N:1} },
};
ddb.getItem(params, function (err, data) {
if (err) {
console.log(err);
} else { // Call DynamoDB to read the item from the table
console.log("Success, duty manager =",data.Item.user_id.N);
}
})
I get the following Error:
{ InvalidParameterType: Expected params.Key['is_active_duty_manager'].N to be a string
at ParamValidator.fail (/Users/kevin/lambda/dynamo/node_modules/aws-sdk/lib/param_validator.js:50:37)
at ParamValidator.validateType (/Users/kevin/lambda/dynamo/node_modules/aws-sdk/lib/param_validator.js:222:10)
at ParamValidator.validateString (/Users/kevin/lambda/dynamo/node_modules/aws-sdk/lib/param_validator.js:154:32)
at ParamValidator.validateScalar (/Users/kevin/lambda/dynamo/node_modules/aws-sdk/lib/param_validator.js:130:21)
at ParamValidator.validateMember (/Users/kevin/lambda/dynamo/node_modules/aws-sdk/lib/param_validator.js:94:21)
at ParamValidator.validateStructure (/Users/kevin/lambda/dynamo/node_modules/aws-sdk/lib/param_validator.js:75:14)
at ParamValidator.validateMember (/Users/kevin/lambda/dynamo/node_modules/aws-sdk/lib/param_validator.js:88:21)
at ParamValidator.validateMap (/Users/kevin/lambda/dynamo/node_modules/aws-sdk/lib/param_validator.js:117:14)
at ParamValidator.validateMember (/Users/kevin/lambda/dynamo/node_modules/aws-sdk/lib/param_validator.js:92:21)
at ParamValidator.validateStructure (/Users/kevin/lambda/dynamo/node_modules/aws-sdk/lib/param_validator.js:75:14)
message: 'Expected params.Key[\'is_active_duty_manager\'].N to be a string',
code: 'InvalidParameterType',
time: 2018-02-26T20:13:09.795Z }
If I export a row as a CSV I can see the column types are S or N and, for example, is_active_duty_manager is definitely a Number. So the question is: why does the error expect the params.Key value to be a string?
Many thanks
Kevin
So looking at your table, you have a primary key on user_id. This means you cannot write a GetItem or Query against the table itself that returns the row you want by is_active_duty_manager; no matter how you write it, it won't work.
As I see it, you basically have two options:
1. Write a Scan with a filter on is_active_duty_manager equals 1. This is however fairly expensive, as it will always read all items.
2. Make a global secondary index on is_active_duty_manager, write 1 to it only for the active duty manager, and leave it blank otherwise. This way you get a sparse index containing just the items that have this value set. You can then query this index, which is very fast and cheap.
If your table is very small, option 1 might still work for you. Cost optimization is a bit out of scope here; good luck!
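For option 2, the query against such a sparse index could be built like the sketch below (the index name active-duty-manager-index is an assumption for illustration, not something that exists on your table):

```javascript
// Sketch: build Query parameters for a hypothetical sparse GSI on
// is_active_duty_manager. The index name is an assumption.
function buildDutyManagerQuery(tableName) {
  return {
    TableName: tableName,
    IndexName: 'active-duty-manager-index', // hypothetical GSI name
    KeyConditionExpression: 'is_active_duty_manager = :v',
    // Note: DynamoDB number values are always passed as strings
    ExpressionAttributeValues: { ':v': { N: '1' } },
    ProjectionExpression: 'user_id, mobile',
  };
}

// Pass the result to ddb.query() in the same callback style as getItem above.
const queryParams = buildDutyManagerQuery('ve-users');
```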
Looks like you need to define the key like this, with the N value as a string:
Key: { 'is_active_duty_manager': {'N':'1'} },
The quotes around the other property names are optional in JavaScript; the required change is the string value for N. For consistency with the API reference you can quote the entire params object:
var params = {
"AttributesToGet": ["mobile"],
"TableName": "ve-users",
"Key": { "is_active_duty_manager": {"N":"1"} },
};
Here is the Request Syntax from the DynamoDB API Reference, which shows that "N" takes a string:
https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_GetItem.html
{
"AttributesToGet": [ "string" ],
"ConsistentRead": boolean,
"ExpressionAttributeNames": {
"string" : "string"
},
"Key": {
"string" : {
"B": blob,
"BOOL": boolean,
"BS": [ blob ],
"L": [
"AttributeValue"
],
"M": {
"string" : "AttributeValue"
},
"N": "string",
"NS": [ "string" ],
"NULL": boolean,
"S": "string",
"SS": [ "string" ]
}
},
"ProjectionExpression": "string",
"ReturnConsumedCapacity": "string",
"TableName": "string"
}
I am running a state machine that runs a DynamoDB query (called using CallAwsService). The format returned looks like this:
{
Items: [
{
"string" : {
"B": blob,
"BOOL": boolean,
"BS": [ blob ],
"L": [
"AttributeValue"
],
"M": {
"string" : "AttributeValue"
},
"N": "string",
"NS": [ "string" ],
"NULL": boolean,
"S": "string",
"SS": [ "string" ]
}
}
]
}
I would like to unmarshall this data efficiently and would like to avoid using a Lambda call for this.
The CDK code we're currently using for the query is below:
interface FindItemsStepFunctionProps {
table: Table
id: string
}
export const FindItemsStepFunction = (scope: Construct, props: FindItemsStepFunctionProps): StateMachine => {
const { table, id } = props
const definition = new CallAwsService(scope, 'Query', {
service: 'dynamoDb',
action: 'query',
parameters: {
TableName: table.tableName,
IndexName: 'exampleIndexName',
KeyConditionExpression: 'id = :id',
ExpressionAttributeValues: {
':id': {
'S.$': '$.path.id',
},
},
},
iamResources: ['*'],
})
return new StateMachine(scope, id, {
logs: {
destination: new LogGroup(scope, `${id}LogGroup`, {
logGroupName: `${id}LogGroup`,
removalPolicy: RemovalPolicy.DESTROY,
retention: RetentionDays.ONE_WEEK,
}),
level: LogLevel.ALL,
},
definition,
stateMachineType: StateMachineType.EXPRESS,
stateMachineName: id,
timeout: Duration.minutes(5),
})
}
Can you unmarshall the data downstream? I'm not too well versed in Step Functions; do you have the ability to import utilities?
Unmarshalling DDB JSON is as simple as calling the unmarshall function from DynamoDB utility:
https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/modules/_aws_sdk_util_dynamodb.html
You may need to do so downstream, as Step Functions seems to use the low-level client.
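If pulling in the utility isn't an option, the shape of DDB JSON is simple enough that a minimal hand-rolled version can handle it. The sketch below covers only the common S, N, BOOL, NULL, L and M attribute types and is not a replacement for the official unmarshall from @aws-sdk/util-dynamodb:

```javascript
// Minimal sketch of DynamoDB JSON unmarshalling; handles only the common
// attribute types (S, N, BOOL, NULL, L, M).
function unmarshallValue(av) {
  if ('S' in av) return av.S;
  if ('N' in av) return Number(av.N);
  if ('BOOL' in av) return av.BOOL;
  if ('NULL' in av) return null;
  if ('L' in av) return av.L.map(unmarshallValue);
  if ('M' in av) return unmarshallItem(av.M);
  throw new Error('Unsupported attribute type: ' + JSON.stringify(av));
}

function unmarshallItem(item) {
  const out = {};
  for (const [key, av] of Object.entries(item)) out[key] = unmarshallValue(av);
  return out;
}

// Example: one item from a Query response.
const plain = unmarshallItem({ id: { S: 'abc' }, count: { N: '2' }, tags: { L: [{ S: 'x' }] } });
// plain => { id: 'abc', count: 2, tags: ['x'] }
```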
Step Functions still doesn't make it easy enough to call DynamoDB directly from a step in a state machine without using a Lambda function. The main missing parts are the handling of the different cases of finding zero, one, or more records in a query, and the unmarshalling of the slightly complicated format of DynamoDB records. Sadly, the $utils library is still not supported in Step Functions.
You will need to implement these two in specific steps in the graph.
Here is a diagram of the steps that we use as DynamoDB query template:
The first step is used to provide parameters to the query. This step can be omitted by defining the parameters directly in the query step:
"Set Query Parameters": {
"Type": "Pass",
"Next": "DynamoDB Query ...",
"Result": {
"tableName": "<TABLE_NAME>",
"key_value": "<QUERY_KEY>",
"attribute_value": "<ATTRIBUTE_VALUE>"
}
}
The next step is the actual query to DynamoDB. You can also use GetItem instead of Query if you have the record keys.
"DynamoDB Query ...": {
"Type": "Task",
"Parameters": {
"TableName.$": "$.tableName",
"IndexName": "<INDEX_NAME_IF_NEEDED>",
"KeyConditionExpression": "#n1 = :v1",
"FilterExpression": "#n2.#n3 = :v2",
"ExpressionAttributeNames": {
"#n1": "<KEY_NAME>",
"#n2": "<ATTRIBUTE_NAME>",
"#n3": "<NESTED_ATTRIBUTE_NAME>"
},
"ExpressionAttributeValues": {
":v1": {
"S.$": "$.key_value"
},
":v2": {
"S.$": "$.attribute_value"
}
},
"ScanIndexForward": false
},
"Resource": "arn:aws:states:::aws-sdk:dynamodb:query",
"ResultPath": "$.ddb_record",
"ResultSelector": {
"result.$": "$.Items[0]"
},
"Next": "Check for DDB Object"
}
The above example seems a bit complicated, using both ExpressionAttributeNames and ExpressionAttributeValues. However, it makes it possible to query on nested attributes such as item.id.
In this example, we only take the first item of the response with $.Items[0]. However, you can take all the results if you need more than one.
The next step is to check if the query returned a record or not.
"Check for DDB Object": {
"Type": "Choice",
"Choices": [
{
"Variable": "$.ddb_record.result",
"IsNull": false,
"Comment": "Found Context Object",
"Next": "Parse DDB Object"
}
],
"Default": "Do Nothing"
}
And lastly, to answer your original question, we can parse the query result, in case we have one:
"Parse DDB Object": {
"Type": "Pass",
"Parameters": {
"string_object.$": "$.ddb_record.result.string_object.S",
"bool_object.$": "$.ddb_record.result.bool_object.BOOL",
"dict_object": {
"nested_dict_object.$": "$.ddb_record.result.item.M.name.S"
},
"dict_object_full.$": "States.StringToJson($.ddb_record.result.JSON_object.S)"
},
"ResultPath": "$.parsed_ddb_record",
"End": true
}
Please note that:
Simple strings are easily converted by "string_object.$": "$.ddb_record.result.string_object.S"
The same goes for numbers and booleans ("bool_object.$": "$.ddb_record.result.bool_object.BOOL")
Nested objects are parsed via the map object ("nested_dict_object.$": "$.ddb_record.result.item.M.name.S", for example)
Creation of a JSON object can be achieved by using States.StringToJson
The parsed object is added as a new entry on the flow using "ResultPath": "$.parsed_ddb_record"
I have this dynamodb:Query in my step function:
{
"Type": "Task",
"Resource": "arn:aws:states:::aws-sdk:dynamodb:query",
"Next": "If nothing returned by query Or Study not yet Zipped",
"Parameters": {
"TableName": "TEST-StudyProcessingTable",
"ScanIndexForward": false,
"Limit": 1,
"KeyConditionExpression": "OrderID = :OrderID",
"FilterExpression": "StudyID = :StudyID",
"ExpressionAttributeValues": {
":OrderID": {
"S.$": "$.body.order_id"
},
":StudyID": {
"S.$": "$.body.study_id"
}
}
},
"ResultPath": "$.processed_files"
}
The results come in as an array called Items, which is nested under my ResultPath:
processed_files.Items:
{
"body": {
"order_id": "1001",
"study_id": "1"
},
"processed_files": {
"Count": 1,
"Items": [
{
"Status": {
"S": "unzipped"
},
"StudyID": {
"S": "1"
},
"ZipFileS3Key": {
"S": "path/to/the/file"
},
"UploadSet": {
"S": "4"
},
"OrderID": {
"S": "1001"
},
"UploadSet#StudyID": {
"S": "4#1"
}
}
],
"LastEvaluatedKey": {
"OrderID": {
"S": "1001"
},
"UploadSet#StudyID": {
"S": "4#1"
}
},
"ScannedCount": 1
}
}
My question is how do i access the items inside this array from a choice state in a step function?
I need to query then decide something based on the results by checking the item in a condition in a choice state.
The problem is that, since this is an array, I can't access it using regular JsonPath (like with Items.item), and in my next step the choice condition does NOT accept an index like processed_files.Items['0'].Status
OK, so the answer was simple: all you need to do is use a number instead of a string for the array index, like this:
processed_files.Items[0].Status
I was originally misled by an error I received which said that it expected a ' or '[' after the first '['. I mistakenly thought this meant it only accepts strings.
I was wrong; it works like any other array.
I hope this helps somebody one day.
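For reference, a Choice rule using that index might look like the following sketch (the state names and the IsPresent guard are illustrative; the guard avoids a runtime error when Items is empty):

```json
"If nothing returned by query Or Study not yet Zipped": {
  "Type": "Choice",
  "Choices": [
    {
      "And": [
        { "Variable": "$.processed_files.Items[0].Status.S", "IsPresent": true },
        { "Variable": "$.processed_files.Items[0].Status.S", "StringEquals": "unzipped" }
      ],
      "Next": "<ALREADY_UNZIPPED_STATE>"
    }
  ],
  "Default": "<DEFAULT_STATE>"
}
```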
Heyo. I've got an AppSync query whose result type has a field attached to its own resolver. The query accepts an argument that is the same argument the inner resolver needs. For the sake of terseness, I'd like to just pass it down from context instead of having to specify it twice. The data source for the resolver is a DynamoDB table.
Say the schema looks like
type Query {
getThings(key: String!): AResult!
}
type AResult {
getOtherThings(key: String!): String!
}
I could construct a query as such
query Query {
getThings(key: "123") {
getOtherThings(key: "123")
}
}
Which is clumsy and redundant. Ideally, I'd just want to create a query that looks like
query Query {
getThings(key: "123") {
getOtherThings
}
}
And the resolver can pull key from the context of the request and reuse it.
The request template for getOtherThings resolver looks like:
{
"version": "2017-02-28",
"operation": "Query",
"query": {
"expression" : "key = :key",
"expressionValues" : {
":key" : $util.dynamodb.toDynamoDBJson($context.arguments.key)
}
}
}
But $context.arguments.key is null. As are $context.args.key, $ctx.args.key and $ctx.arguments.key. If I examine the logs from the request when executing getThings I can see the expected arguments:
{
"logType": "RequestMapping",
"path": [
"getThings"
],
"fieldName": "getThings",
"context": {
"arguments": {
"key": "123"
},
"stash": {},
"outErrors": []
},
"fieldInError": false,
"errors": [],
"parentType": "Query"
}
So I surmise that the context does not persist between the parent resolver (getThings) and its child resolver (getOtherThings), but I can't confirm this from the logs.
Is this even possible? I'm coming up dry searching through the AWS logs.
The answer lies in ctx.source: it is a map of the parent field's resolved value, so I can grab the key from there.
{
"logType": "RequestMapping",
"path": [
"getThings"
],
"source": {
"key":"123"
},
"fieldName": "getThings",
"context": {
"arguments": {
"key": "123"
},
"stash": {},
"outErrors": []
},
"fieldInError": false,
"errors": [],
"parentType": "Query"
}
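With that, the request template for getOtherThings can read the key from $context.source instead of $context.arguments (this assumes the parent resolver returns key as part of its result):

```json
{
    "version": "2017-02-28",
    "operation": "Query",
    "query": {
        "expression" : "key = :key",
        "expressionValues" : {
            ":key" : $util.dynamodb.toDynamoDBJson($context.source.key)
        }
    }
}
```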
I have this JSON record:
{
  "name": "Pete",
  "age": 24,
  "subjects": [
    {
      "name": "maths",
      "grade": "A"
    },
    {
      "name": "maths",
      "grade": "B"
    }
  ]
}
and I want to ingest this into a pinot table to run a query like
select age,subjects_grade,count(*) from table group by age,subjects_grade
Is there a way to do this in a pinot job?
Pinot has two ways to handle JSON records:
1. Flatten the record during ingestion time:
In this case, we treat each nested field as a separate field, so we need to:
Define those fields in the table schema
Define transform functions to flatten the nested fields in the table config
Please see how columns subjects_name and subjects_grade are defined below. Since subjects is an array, both fields are multi-value columns in Pinot.
2. Directly ingest JSON records
In this case, we treat the whole nested field as one single field, so we need to:
Define the JSON field in the table schema as a string with a maxLength value
Put this field into noDictionaryColumns and jsonIndexColumns in the table config
Define a jsonFormat transform function to stringify the JSON field in the table config
Please see how column subjects_str is defined below.
Below is the sample table schema/config/query:
Sample Pinot Schema:
{
"metricFieldSpecs": [],
"dimensionFieldSpecs": [
{
"dataType": "STRING",
"name": "name"
},
{
"dataType": "LONG",
"name": "age"
},
{
"dataType": "STRING",
"name": "subjects_str"
},
{
"dataType": "STRING",
"name": "subjects_name",
"singleValueField": false
},
{
"dataType": "STRING",
"name": "subjects_grade",
"singleValueField": false
}
],
"dateTimeFieldSpecs": [],
"schemaName": "myTable"
}
Sample Table Config:
{
"tableName": "myTable",
"tableType": "OFFLINE",
"segmentsConfig": {
"segmentPushType": "APPEND",
"segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy",
"schemaName": "myTable",
"replication": "1"
},
"tenants": {},
"tableIndexConfig": {
"loadMode": "MMAP",
"invertedIndexColumns": [],
"noDictionaryColumns": [
"subjects_str"
],
"jsonIndexColumns": [
"subjects_str"
]
},
"metadata": {
"customConfigs": {}
},
"ingestionConfig": {
"batchIngestionConfig": {
"segmentIngestionType": "APPEND",
"segmentIngestionFrequency": "DAILY",
"batchConfigMaps": [],
"segmentNameSpec": {},
"pushSpec": {}
},
"transformConfigs": [
{
"columnName": "subjects_str",
"transformFunction": "jsonFormat(subjects)"
},
{
"columnName": "subjects_name",
"transformFunction": "jsonPathArray(subjects, '$.[*].name')"
},
{
"columnName": "subjects_grade",
"transformFunction": "jsonPathArray(subjects, '$.[*].grade')"
}
]
}
}
Sample Query:
select age, subjects_grade, count(*) from myTable GROUP BY age, subjects_grade
select age, json_extract_scalar(subjects_str, '$.[*].grade', 'STRING') as subjects_grade, count(*) from myTable GROUP BY age, subjects_grade
Comparing both ways, we recommend solution 1 (flattening the nested fields out) when the field density is high (e.g. every document has the fields name and grade, so it's worth extracting them into new columns); it gives better query performance and better storage efficiency.
Solution 2 is simpler to configure and good for sparse fields (e.g. only a few documents have certain fields), but it requires using the json_extract_scalar function to access the nested fields.
Please also note the behavior of Pinot GROUP BY on multi-value columns.
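To illustrate that behavior, here is a small sketch in plain JavaScript (not Pinot code): flattening with jsonPathArray turns subjects into multi-value columns, and a GROUP BY on a multi-value column produces one group per value:

```javascript
// Illustration only: emulates jsonPathArray flattening and multi-value
// GROUP BY semantics (each value in the multi-value column contributes
// its own group row).
const record = {
  name: 'Pete',
  age: 24,
  subjects: [
    { name: 'maths', grade: 'A' },
    { name: 'maths', grade: 'B' },
  ],
};

// jsonPathArray(subjects, '$.[*].grade') -> multi-value column
const subjects_grade = record.subjects.map((s) => s.grade); // ['A', 'B']

// GROUP BY age, subjects_grade: one group per (age, grade) pair
const counts = {};
for (const grade of subjects_grade) {
  const key = `${record.age}|${grade}`;
  counts[key] = (counts[key] || 0) + 1;
}
// counts => { '24|A': 1, '24|B': 1 }
```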
More references:
Pinot Column Transformation
Pinot JSON Functions
Pinot JSON Index
Pinot Multi-value Functions
I have a step where I want to update a object on a DynamoDB table.
Everything works, except it's creating a new item with the literal ID value "$.id" instead of updating the item whose ID I pass in.
This is my first state machine attempt, so what have I done wrong here?
"update-table-processing": {
"Type": "Task",
"Resource": "arn:aws:states:::dynamodb:updateItem",
"ResultPath": "$.updateResult",
"Parameters": {
"TableName": "Projects",
"Key": {
"id": {
"S": "$.id"
}
},
"UpdateExpression": "SET step = :updateRef",
"ExpressionAttributeValues": {
":updateRef": {
"S": "processing"
}
},
"ReturnValues": "ALL_NEW"
},
"Next": "create-project"
},
Do I somehow need to tell DynamoDB to evaluate "$.id" rather than treating it as a literal "S" string value, or is this happening because I've not mapped the input correctly and the "$.id" value is empty?
My input looks like:
{
"id": "f8185735-c90d-4d4e-8689-cec68a48b1bc"
}
In order to specify data from your input you have to use a key-value pair, with the key name ending in ".$". So to fix this you need to change it to:
"Key": {
"id": {
"S.$": "$.id"
}
},
Using the above, it will correctly resolve to the value from your input instead of the literal string "$.id".
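Applied to the state from the question, only the Key entry changes:

```json
"update-table-processing": {
  "Type": "Task",
  "Resource": "arn:aws:states:::dynamodb:updateItem",
  "ResultPath": "$.updateResult",
  "Parameters": {
    "TableName": "Projects",
    "Key": {
      "id": {
        "S.$": "$.id"
      }
    },
    "UpdateExpression": "SET step = :updateRef",
    "ExpressionAttributeValues": {
      ":updateRef": {
        "S": "processing"
      }
    },
    "ReturnValues": "ALL_NEW"
  },
  "Next": "create-project"
}
```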
References - https://docs.aws.amazon.com/step-functions/latest/dg/input-output-inputpath-params.html#input-output-parameters