How to specify attributes to return from DynamoDB through AppSync - amazon-web-services

I have an AppSync pipeline resolver. The first function queries an ElasticSearch database for the DynamoDB keys. The second function queries DynamoDB using the provided keys. This was all working well until I ran into the 1 MB limit of AppSync. Since most of the data is in a few attributes/columns I don't need, I want to limit the results to just the attributes I need.
I tried adding AttributesToGet and ProjectionExpression (from here) but both gave errors like:
{
    "data": {
        "getItems": null
    },
    "errors": [
        {
            "path": [
                "getItems"
            ],
            "data": null,
            "errorType": "MappingTemplate",
            "errorInfo": null,
            "locations": [
                {
                    "line": 2,
                    "column": 3,
                    "sourceName": null
                }
            ],
            "message": "Unsupported element '$[tables][dev-table-name][projectionExpression]'."
        }
    ]
}
My DynamoDB function's request mapping template looks like this (it returns results as long as the data is less than 1 MB):
#set($ids = [])
#foreach($pResult in ${ctx.prev.result})
    #set($map = {})
    $util.qr($map.put("id", $util.dynamodb.toString($pResult.id)))
    $util.qr($map.put("ouId", $util.dynamodb.toString($pResult.ouId)))
    $util.qr($ids.add($map))
#end
{
    "version" : "2018-05-29",
    "operation" : "BatchGetItem",
    "tables" : {
        "dev-table-name": {
            "keys": $util.toJson($ids),
            "consistentRead": false
        }
    }
}

I contacted AWS support, who confirmed that ProjectionExpression is not currently supported in AppSync and that it will be a while before they get to it.
Instead, I created a lambda to pull the data from DynamoDB.
To limit the results from DynamoDB, I used $ctx.info.selectionSetList in AppSync to get the list of requested fields, then used that list to specify which attributes to pull from DynamoDB. I needed to get multiple results while maintaining order, so I used BatchGetItem and then merged the results with the original list of IDs using LINQ (which put the DynamoDB results back in the correct order, since BatchGetItem in C# does not preserve sort order the way the AppSync version does).
Because I was using C# with a number of libraries, the cold start time was a little long, so I used Lambda Layers pre-JITed for Linux, which brought the cold start down from ~1.8 seconds to ~1 second (with 1024 MB of RAM allocated to the Lambda).
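As a rough illustration of the wiring (not the exact code I used; the payload shape here is just an assumption), the Lambda function's request mapping template can forward both the keys from the previous pipeline function and the fields the client actually requested:
{
    "version": "2018-05-29",
    "operation": "Invoke",
    "payload": {
        ## Keys produced by the previous pipeline function (assumed shape)
        "keys": $util.toJson($ctx.prev.result),
        ## Top-level fields requested in the GraphQL selection set
        "fields": $util.toJson($ctx.info.selectionSetList)
    }
}
Inside the Lambda, the fields list can then be turned into a ProjectionExpression for the BatchGetItem call so DynamoDB only returns the attributes that were asked for.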

AppSync doesn't support projection expressions, but you can explicitly define which fields to return in the response mapping template instead of returning the entire result set.
{
    "id": "$ctx.result.get('id')",
    "name": "$ctx.result.get('name')",
    ...
}
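For the BatchGetItem function in the question, the same idea applied per item might look like this (a sketch; it assumes AppSync's BatchGetItem result shape, where $ctx.result.data is keyed by table name):
#set($items = [])
#foreach($item in $ctx.result.data.get("dev-table-name"))
    ## Keep only the attributes the client actually needs
    #set($trimmed = {})
    $util.qr($trimmed.put("id", $item.id))
    $util.qr($trimmed.put("ouId", $item.ouId))
    $util.qr($items.add($trimmed))
#end
$util.toJson($items)
Note that this only shrinks what AppSync returns to the client; if the limit being hit is on the raw data-source response itself, trimming in the response template may not be enough, which is presumably why the Lambda approach above was needed.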

Related

Different default behaviors of skipping empty buckets in timeseries queries by Druid native query and Druid SQL

According to the Apache Druid documentation on native queries:
Timeseries queries normally fill empty interior time buckets with zeroes.
For example, if you issue a "day" granularity timeseries query for the interval 2012-01-01/2012-01-04, and no data exists for 2012-01-02, you will receive:
[
    {
        "timestamp": "2012-01-01T00:00:00.000Z",
        "result": { "sample_name1": <some_value> }
    },
    {
        "timestamp": "2012-01-02T00:00:00.000Z",
        "result": { "sample_name1": 0 }
    },
    {
        "timestamp": "2012-01-03T00:00:00.000Z",
        "result": { "sample_name1": <some_value> }
    }
]
This can be controlled by the context flag "skipEmptyBuckets", whose default value is false (empty buckets are not skipped; they are zero-filled).
However, when querying timeseries data with Druid SQL, the default behavior is to skip all empty buckets. I have to set query context explicitly to get the results I want.
"context": {
"skipEmptyBuckets": true
}
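For reference, a query context like this is passed in the JSON body of a request to Druid's SQL endpoint (/druid/v2/sql); a minimal sketch with a made-up datasource and query:
{
    "query": "SELECT FLOOR(__time TO DAY) AS \"day\", COUNT(*) AS \"cnt\" FROM my_datasource GROUP BY 1",
    "context": {
        "skipEmptyBuckets": true
    }
}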
This troubles me a lot because I need zero-filling to show all buckets in Apache Superset's timeseries charts. But there's no way to set the query context.
As far as I know, the SQL statement is internally translated to a native query, so why the inconsistency?

Appsync custom resolver error "Unable to transform Response Template"

I am trying to write a "BatchPutItem" custom resolver so that I can create multiple items (no more than 25 at a time); it should accept a list of arguments and then perform a batch operation.
Here is the code I have in the custom resolver:
#set($pdata = [])
#foreach($item in ${ctx.args.input})
    $util.qr($item.put("createdAt", $util.time.nowISO8601()))
    $util.qr($item.put("updatedAt", $util.time.nowISO8601()))
    $util.qr($item.put("__typename", "UserNF"))
    $util.qr($item.put("id", $util.defaultIfNullOrBlank($item.id, $util.autoId())))
    $util.qr($pdata.add($util.dynamodb.toMapValues($item)))
#end
{
    "version" : "2018-05-29",
    "operation" : "BatchPutItem",
    "tables" : {
        "Usertable1-staging": $utils.toJson($pdata)
    }
}
Response in the query console section:
{
    "data": {
        "createBatchUNF": null
    },
    "errors": [
        {
            "path": [
                "createBatchUserNewsFeed"
            ],
            "data": null,
            "errorType": "MappingTemplate",
            "errorInfo": null,
            "locations": [
                {
                    "line": 2,
                    "column": 3,
                    "sourceName": null
                }
            ],
            "message": "Unsupported operation 'BatchPutItem'. Datasource Versioning only supports the following operations (TransactGetItems,PutItem,BatchGetItem,Scan,Query,GetItem,DeleteItem,UpdateItem,Sync)"
        }
    ]
}
And the query is:
mutation MyMutation {
    createBatchUNF(input: [{seen: false, userNFUserId: "userID", userNFPId: "pID", userNFPOwnerId: "ownerID"}]) {
        items {
            id
            seen
        }
    }
}
Conflict detection is also turned off.
And when I check the CloudWatch logs, I find this error there as well.
I was able to solve this issue by disabling DataStore for entire API.
When we create a new project/backend app using the Amplify console, the Data tab asks us to enable DataStore so that we can start modelling our GraphQL API. That is what enables versioning for the entire API, and it prevents batch writes from executing.
So in order to solve this issue, we can follow these steps:
[Note: I am using #aws-amplify/cli]
$ amplify update api
? Please select from one of the below mentioned services: GraphQL
? Select from the options below: Disable DataStore for entire API
And then amplify push
This will fix this issue.
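Separately, once batch operations are allowed, the response mapping template for BatchPutItem has to unwrap the per-table result so the mutation can return items. A rough sketch, assuming the same table name as above:
#if($ctx.error)
    $util.error($ctx.error.message, $ctx.error.type)
#end
{
    ## BatchPutItem results are keyed by table name under $ctx.result.data
    "items": $util.toJson($ctx.result.data.get("Usertable1-staging"))
}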

AWS IoT rule - timestamp for Elasticsearch

I have a bunch of IoT devices (ESP32) which publish a JSON object to things/THING_NAME/log for general debugging (to be extended into other topics with values in the future).
Here is the IoT rule which kind of works.
{
    "sql": "SELECT *, parse_time(\"yyyy-mm-dd'T'hh:mm:ss\", timestamp()) AS timestamp, topic(2) AS deviceId FROM 'things/+/stdout'",
    "ruleDisabled": false,
    "awsIotSqlVersion": "2016-03-23",
    "actions": [
        {
            "elasticsearch": {
                "roleArn": "arn:aws:iam::xxx:role/iot-es-action-role",
                "endpoint": "https://xxxx.eu-west-1.es.amazonaws.com",
                "index": "devices",
                "type": "device",
                "id": "${newuuid()}"
            }
        }
    ]
}
I'm not sure how to set @timestamp inside Elasticsearch to allow time-based searches.
Maybe I'm going about this all wrong, but it almost works!
Elasticsearch can recognize date strings matching dynamic_date_formats.
The following format is automatically mapped as a date field in AWS Elasticsearch 7.1:
SELECT *, parse_time("yyyy/MM/dd HH:mm:ss", timestamp()) AS timestamp FROM 'events/job/#'
This approach does not require creating a preconfigured index, which is important for dynamically created indices, e.g. with daily rotation for logs:
devices-${parse_time("yyyy.MM.dd", timestamp(), "UTC")}
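For illustration, that pattern goes straight into the rule's elasticsearch action (role/endpoint placeholders kept from the question):
{
    "elasticsearch": {
        "roleArn": "arn:aws:iam::xxx:role/iot-es-action-role",
        "endpoint": "https://xxxx.eu-west-1.es.amazonaws.com",
        "index": "devices-${parse_time(\"yyyy.MM.dd\", timestamp(), \"UTC\")}",
        "type": "device",
        "id": "${newuuid()}"
    }
}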
According to elastic.co documentation,
The default value for dynamic_date_formats is:
[ "strict_date_optional_time","yyyy/MM/dd HH:mm:ss Z||yyyy/MM/dd Z"]
@timestamp is just a convention, as the @ prefix is the default prefix for Logstash-generated fields. Because you are not using Logstash as a middleman between IoT and Elasticsearch, you don't have a default mapping for @timestamp.
But basically it is just a name, so call it what you want; the only thing that matters is that you declare it as a timestamp field in the mappings section of the Elasticsearch index.
If for some reason you still need it to be called @timestamp, you can either SELECT it under that name right away in the AS clause (this might run into IoT's SQL restrictions, I'm not sure):
SELECT *, parse_time(\"yyyy-mm-dd'T'hh:mm:ss\", timestamp()) AS @timestamp, topic(2) AS deviceId FROM 'things/+/stdout'
Or you use the copy_to functionality when declaring your mapping:
PUT devices/device
{
    "mappings": {
        "properties": {
            "timestamp": {
                "type": "date",
                "copy_to": "@timestamp"
            },
            "@timestamp": {
                "type": "date"
            }
        }
    }
}
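With a date mapping like that in place, time-based searches against the field work as usual; for example (a minimal range query, assuming the index and field names above):
GET devices/_search
{
    "query": {
        "range": {
            "@timestamp": {
                "gte": "now-1d"
            }
        }
    }
}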

AppSync GraphQL mutation server logic in resolvers

I am having issues finding good sources for / figuring out how to correctly add server-side validation to my AppSync GraphQL mutations.
In essence I used AWS dashboard to define my AppSync schema, hence had DynamoDB tables created for me, plus some basic resolvers set up for the data.
Now I need to achieve the following:
I have a player who has an inventory and gold.
The player calls a purchaseItem mutation with an item_id.
Once this mutation is called, I need to perform some checks in the resolver, i.e. check if the item_id exists in the 'Items' table of the associated DynamoDB, check if the player has enough gold (again, in the 'Players' table), and if so, write to the Players DynamoDB table by adding the item to their inventory and the new, subtracted gold amount.
I believe the most efficient way to achieve this, resulting in the least cost and latency, is to use the "Apache Velocity" templating language for AppSync?
It would be great to see example of this showing how to Query / Write to DynamoDB, handle errors and resolve the mutation correctly.
For writing to DynamoDB with VTL, use the AWS resolver mapping template tutorial; you can start with the PutItem template. My request template looks like this:
{
    "version" : "2017-02-28",
    "operation" : "PutItem",
    "key" : {
        "noteId" : { "S" : "${context.arguments.noteId}" },
        "userId" : { "S" : "${context.identity.sub}" }
    },
    "attributeValues" : {
        "title" : { "S" : "${context.arguments.title}" },
        "content": { "S" : "${context.arguments.content}" }
    }
}
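The matching response mapping template can usually just pass the result through (with optional error propagation), for example:
#if($ctx.error)
    $util.error($ctx.error.message, $ctx.error.type)
#end
$util.toJson($ctx.result)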
For query:
{ "version" : "2017-02-28",
"operation" : "Query",
"query" : {
## Provide a query expression. **
"expression": "userId = :userId",
"expressionValues" : {
":userId" : {
"S" : "${context.identity.sub}"
}
}
},
## Add 'limit' and 'nextToken' arguments to this field in your schema to implement pagination. **
"limit": #if(${context.arguments.limit}) ${context.arguments.limit} #else 20 #end,
"nextToken": #if(${context.arguments.nextToken}) "${context.arguments.nextToken}" #else null #end
}
This is based on the Paginated Query template.
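The corresponding response template typically returns the items together with the pagination token, along these lines:
{
    "items": $util.toJson($ctx.result.items),
    "nextToken": $util.toJson($ctx.result.nextToken)
}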
What you want to look at is Pipeline Resolvers:
https://docs.aws.amazon.com/appsync/latest/devguide/pipeline-resolvers.html
Yes, this requires VTL (Velocity Templates).
Pipeline resolvers allow you to perform reads, writes, validation, and anything else you'd like using VTL. What you basically do is chain the output of one template into the input of the next and perform the required processing.
Here's a Medium post showing you how to do it:
https://medium.com/@dabit3/intro-to-aws-appsync-pipeline-functions-3df87ceddac1
In other words, what you can do is:
Have one function that queries the database, then pipe the result into another function that validates it and either performs the write if the checks pass or fails the mutation.
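For the purchaseItem example, the "validate and write" step can often be collapsed into a single conditional UpdateItem in the last pipeline function. A rough sketch only, with assumed names: a Players table keyed by id, gold and inventory (string set) attributes, and the item's price stashed in $ctx.stash.itemPrice by an earlier lookup function:
{
    "version" : "2018-05-29",
    "operation" : "UpdateItem",
    "key" : {
        "id" : $util.dynamodb.toDynamoDBJson($ctx.args.playerId)
    },
    "update" : {
        ## Subtract the price and add the purchased item to the inventory set
        "expression" : "SET gold = gold - :price ADD inventory :newItem",
        "expressionValues" : {
            ":price" : $util.dynamodb.toDynamoDBJson($ctx.stash.itemPrice),
            ":newItem" : { "SS" : [ "${ctx.args.item_id}" ] }
        }
    },
    "condition" : {
        ## The write only happens if the player can afford the item
        "expression" : "gold >= :minGold",
        "expressionValues" : {
            ":minGold" : $util.dynamodb.toDynamoDBJson($ctx.stash.itemPrice)
        }
    }
}
If the condition fails, the function returns an error that you can catch in the response template (e.g. with $ctx.error) and turn into a friendly "not enough gold" message.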

How to set variable Request format in Amazon Api Gateway?

I want to make a model for a request where some part of the request structure may change.
As I don't have a uniform structure here, how can I define a JSON model for Amazon API Gateway?
Request:
Here the data inside items.{index}.data changes according to type_id. We are also not sure which item with a particular type_id will come at which {index}; even the type of items.{index}.data may change.
{
    "name": "Jon Doe",
    "items": [
        {
            "type_id": 2,
            "data": {
                "km": 10,
                "fuel": 20
            }
        },
        {
            "type_id": 5,
            "data": [
                {
                    "id": 1,
                    "value": 2
                },
                .....
            ]
        },
        {
            "type_id": 3,
            "data": "data goes here"
        },
        ....
    ]
}
How should I do this?
API Gateway uses JSON schema for model definitions. You can use a union datatype to represent your data object. See this question for an example of such a datatype.
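A minimal sketch of such a model using a JSON Schema type union for data (draft-04; the title is made up and the property names are taken from the question, so treat it as a starting point rather than a complete model):
{
    "$schema": "http://json-schema.org/draft-04/schema#",
    "title": "VariableItemsRequest",
    "type": "object",
    "properties": {
        "name": { "type": "string" },
        "items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "type_id": { "type": "integer" },
                    "data": { "type": ["object", "array", "string"] }
                },
                "required": ["type_id"]
            }
        }
    }
}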
Please note that a data model such as this will pose problems for generating SDKs. If you need SDK support for strictly typed languages, you may want to reconsider this data model.