I am trying to learn how to use AppSync and its DynamoDB integrations.
I have successfully created an AppSync GraphQL API and linked a resolver to a getter on the primary key, and I thought I understood what was happening. However, I cannot get a putItem resolver to work at all, and I am struggling to find a useful way to debug the logic.
There is a CDK repository (linked below) that deploys the app. Lines 133-145 contain a handwritten schema which I thought should work; however, it produces this error:
One or more parameter values were invalid: Type mismatch for key food_name expected: S actual: NULL (Service: DynamoDb, Status Code: 400
I have also tried wrapping the expressions in quotes, but that produces errors as well.
Where should I go from here?
The example data creates a table with keys
food_name
scientific_name
group
sub_group
with food_name as the primary key.
https://github.com/AG-Labs/AppSyncTask
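One way to read the error above, offered as a sketch rather than a diagnosis of this particular template: DynamoDB expects every key as a typed attribute value, and AppSync's $util.dynamodb.toDynamoDBJson helper renders a null argument as a NULL attribute rather than a string. The Python below imitates that conversion for strings and nulls only. The function names, the argument layout and the sample food name are hypothetical.

```python
import json


def to_dynamodb_attr(value):
    """Imitate how a value is rendered as a typed DynamoDB attribute.

    Only str and None are handled; this is an illustration, not AppSync's code.
    """
    if value is None:
        return {"NULL": None}  # what a missing argument turns into
    if isinstance(value, str):
        return {"S": value}
    raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")


def build_put_item_key(args):
    # Assumption: the template reads the key from args["food_name"].
    return {"food_name": to_dynamodb_attr(args.get("food_name"))}


# The argument sits where the template looks for it: the key is a string (S).
good_key = build_put_item_key({"food_name": "Angelica"})

# The argument is nested elsewhere (for example under "input"): the lookup
# yields None, the key becomes NULL, and DynamoDB reports a type mismatch.
bad_key = build_put_item_key({"input": {"food_name": "Angelica"}})

print(json.dumps(good_key))
print(json.dumps(bad_key))
```

If the logged request mapping output shows a NULL key like the second case, the path used in the template (for example $ctx.args.food_name versus $ctx.args.input.food_name) is a reasonable first thing to check.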
Today I have attempted to reimplement the list resolver as
{
    "version" : "2017-02-28",
    "operation" : "Scan",
    ## Add 'limit' and 'nextToken' arguments to this field in your schema to implement pagination. **
    "limit": $util.defaultIfNull(${ctx.args.limit}, 20),
    "nextToken": $util.toJson($util.defaultIfNullOrBlank($ctx.args.nextToken, null))
}
with a response mapping of
$util.toJson($ctx.result.items)
In CloudWatch I can see a list of results under log type ResponseMapping (albeit not correctly filtered, but I'll ignore that for now), but these are not returned to the caller. The result is simply
{
    "data": {
        "listGenericFoods": {
            "items": null
        }
    }
}
I don't understand where this is going wrong.
The problem was that the resolvers were nested.
Writing a handwritten schema fixed the issue but resulted in a poorer API. I am going back a few steps and will implement it from the ground up, slowly adding more resolvers.
The CloudWatch Logs helped somewhat once they were turned on, but debugging still required a lot of changing the resolvers ever so slightly and retrying.
Related
I am using a graphql API with AppSync that receives post requests from a lambda function that is triggered by AWS IoT with sensor data in the following JSON format:
{
    "scoredata": {
        "id": "240",
        "distance": 124,
        "timestamp": "09:21:11",
        "Date": "04/16/2022"
    }
}
The Lambda function uses this JSON object to perform a POST request on the GraphQL API, and AppSync puts this data in DynamoDB to be stored. My issue is that whenever I parse the JSON object within my Lambda function to retrieve the id value, it does not match the id value stored in DynamoDB; AppSync is seemingly generating an id automatically.
Here is a screenshot of the request made to the graphql api from cloudwatch:
Here is what DynamoDB is storing:
I would like to know why the id in DynamoDB is shown as "964a3cb2-1d3d-4f1e-a94a-9e4640372963" when the id value in the POST request is "240", and whether there is anything I can do to fix this.
I can't tell for certain, but I'm guessing that the DynamoDB schema is autogenerating the id field on insert and using a UUID as the id type. An alternative would be to introduce a new property such as score_id to store this extraneous id.
If you are using Amplify, the request mapping templates it generates most likely treat the "id" field as a unique identifier to be generated at runtime.
I recommend taking a look at your VTL request template; you will most likely find something like this:
$util.qr($context.args.input.put("id", $util.defaultIfNull($ctx.args.input.id, $util.autoId())))
The self-generated id almost certainly comes from $util.autoId().
Some older versions of Amplify may have omitted the $util.defaultIfNull($ctx.args.input.id, ...) check and always overwritten the id with a generated one.
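To make the difference between the two template behaviours concrete, here is a small Python imitation. It is only a sketch: resolve_id and always_generate_id are hypothetical names, and uuid.uuid4() stands in for $util.autoId().

```python
import uuid


def resolve_id(input_args):
    """Keep a caller-supplied id; otherwise generate one.

    Mirrors $util.defaultIfNull($ctx.args.input.id, $util.autoId()).
    """
    supplied = input_args.get("id")
    return supplied if supplied is not None else str(uuid.uuid4())


def always_generate_id(input_args):
    """Variant without the null check: any supplied id is overwritten."""
    return str(uuid.uuid4())


kept = resolve_id({"id": "240", "distance": 124})
generated = resolve_id({"distance": 124})
overwritten = always_generate_id({"id": "240", "distance": 124})

print(kept)         # the caller's id survives
print(generated)    # a random UUID, because no id was supplied
print(overwritten)  # a random UUID, even though "240" was supplied
```

If the stored id is a UUID although the mutation was meant to carry "240", either the template behaves like the second function or the id never reached the resolver under input.id.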
I've been working on an application hosted on the AWS cloud that is part of a data pipeline. The application processes events from EventBridge, does some data mapping and then puts the result on a Kinesis stream.
The incoming events payload looks something like this (truncated for readability):
{
    "version": "0",
    "id": "9a0f9e20-c518-a968-7fa6-1d8038a5bcfc",
    "detail-type": "Some sort of event",
    ....
}
and the event put onto the Kinesis stream looks something like:
{
    "eventId": "9a0f9e20-c518-a968-7fa6-1d8038a5bcfc",
    "eventTime": "2021-04-08T06:19:47.683Z",
    "eventType": "created",
    ...
}
I looked at the "id" attribute on the incoming event and at first glance it looks like a UUID. I put a few examples into an online validator and it came back as a valid UUID. Since it is a UUID and is supposed to be "universally unique" I thought I might just reuse that ID for the "eventId" attribute of the outgoing payload. I thought that might even make it easier to trace events back to the source.
However, when I started my integration testing I started to notice alarms going off on unrelated services. There were validation errors happening all over the place. Turns out that the downstream services didn't like the format of "eventId".
The downstream services use the "uuid" NPM module to validate UUIDs in our event envelopes and it seems like it doesn't like the UUIDs that come from AWS. To make sure that I had diagnosed the problem correctly I fired up a node REPL and tried to validate one of the UUIDs that came through and sure enough it came back as invalid!
> const uuid = require('uuid');
> uuid.validate("9a0f9e20-c518-a968-7fa6-1d8038a5bcfc")
false
I then checked the regex that the 'uuid' module was using to do the validation and I noticed that it was checking for the numbers 1-5 in the first character of the third group of the UUID.
Confused, I checked the Wikipedia page for UUIDs and discovered that the version digit of the UUIDs coming from AWS is A, instead of one of the expected version numbers (1-5).
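The check described above can be reproduced without the npm module. The Python sketch below reads the version digit directly and imitates a validator that only accepts versions 1 to 5. The helper names and the regular expression are my own, not the ones used by the 'uuid' package.

```python
import re
import uuid

AWS_EVENT_ID = "9a0f9e20-c518-a968-7fa6-1d8038a5bcfc"

# Shape-only check: 8-4-4-4-12 hex digits, no constraint on the version digit.
UUID_SHAPE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)


def version_digit(text):
    """Return the version nibble: the first hex digit of the third group."""
    return int(text.split("-")[2][0], 16)


def is_version_1_to_5(text):
    """Imitate a validator that only accepts version digits 1 through 5."""
    return bool(UUID_SHAPE.match(text)) and 1 <= version_digit(text) <= 5


print(version_digit(AWS_EVENT_ID))      # 10, i.e. hexadecimal A
print(is_version_1_to_5(AWS_EVENT_ID))  # False
print(uuid.UUID(AWS_EVENT_ID))          # Python's parser does not check the version
```

The identifier has the right shape and is only rejected on its version digit, so a shape-only check at the boundary would let it through while stricter downstream validators continue to reject it.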
I have a few related questions:
Why does AWS have its own UUID version?
Is it even a UUID?
Why would AWS go and violate the principle of least astonishment like that? Surely it's easier to just use a regular UUID?
I'm hoping someone has an interesting story about how AWS had to invent their own UUID version to deal with some epic engineering problem that only happens at their scale, but I suppose I'll settle for a more simple answer.
Following the AWS Personalize documentation, I successfully imported my datasets (User, Item, Interaction) from S3, created an event tracker, trained the model, and deployed the campaign. The solution works without any issue and I get the recommendations.
I rely on PutEvents to add new user-item interaction events. I also dump those interaction events to my S3 bucket using Lambda + Firehose. But I am wondering whether AWS Personalize internally creates or augments the original user-item interaction dataset. How can I access and download the revised version of the dataset? I cannot see any new dataset in "Dataset groups > Datasets" other than my original 3 datasets.
I prefer to dump it regularly from AWS Personalize to my S3 storage rather than using my own Lambda+Firehose solution.
This is the output of my PutEvents call. I see a 200, but I am not sure whether it worked. Should I see a new dataset created by PutEvents in "Dataset groups > Datasets"?
{
    "ResponseMetadata": {
        "RequestId": "a6c96496-cbd6-4ad8-9183-371d1794cbd8",
        "HTTPStatusCode": 200,
        "HTTPHeaders": {
            "content-type": "application/json",
            "date": "Mon, 04 Jan 2021 18:04:28 GMT",
            "x-amzn-requestid": "a6c96496-cbd6-4ad8-9183-371d1794cbd8",
            "content-length": "0",
            "connection": "keep-alive"
        },
        "RetryAttempts": 0
    }
}
Update: Now it's possible
AWS documentation:
https://docs.aws.amazon.com/personalize/latest/dg/export-data.html
You can use this AWS CLI command to export only the interactions that were added by PutEvents/PutUsers/PutItems API calls:
aws personalize create-dataset-export-job \
    --job-name "job name" \
    --dataset-arn "dataset ARN" \
    --job-output "{\"s3DataDestination\":{\"kmsKeyArn\":\"kms key ARN\",\"path\":\"s3://bucket-name/folder-name/\"}}" \
    --role-arn "role ARN" \
    --ingestion-mode PUT
Here --ingestion-mode PUT restricts the export to incrementally added data. As the documentation puts it:
Specify PUT to export only data that you imported incrementally using the console or the PutEvents, PutUsers, or PutItems operations.
So I believe it covers your use case.
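The same export can be started from code. The sketch below builds the request for boto3's create_dataset_export_job and keeps the actual call in a separate function that is not executed here, because it needs boto3 and AWS credentials. The ARNs, job name and helper names are placeholders I made up.

```python
import json


def build_export_job_request(job_name, dataset_arn, role_arn, s3_path, kms_key_arn=None):
    """Build keyword arguments for Personalize's CreateDatasetExportJob."""
    destination = {"path": s3_path}
    if kms_key_arn:
        destination["kmsKeyArn"] = kms_key_arn
    return {
        "jobName": job_name,
        "datasetArn": dataset_arn,
        "roleArn": role_arn,
        "ingestionMode": "PUT",  # only incrementally added records
        "jobOutput": {"s3DataDestination": destination},
    }


def start_export_job(request):
    """Submit the request. Requires boto3 and credentials; not called in this sketch."""
    import boto3

    return boto3.client("personalize").create_dataset_export_job(**request)


request = build_export_job_request(
    job_name="interactions-put-export",  # placeholder
    dataset_arn="arn:aws:personalize:us-east-1:123456789012:dataset/example/INTERACTIONS",  # placeholder
    role_arn="arn:aws:iam::123456789012:role/ExamplePersonalizeExportRole",  # placeholder
    s3_path="s3://bucket-name/folder-name/",
)
print(json.dumps(request["jobOutput"]))
```

Building the jobOutput structure as a dictionary avoids the hand-escaped JSON string that the CLI form requires.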
No, it's not possible
It's simply impossible right now to export this data.
There is no API to retrieve a dump of your Interactions dataset in Personalize.
I believe the Lambda + Firehose workaround is the correct approach.
But how do you test whether PutEvents works?
To make sure that interactions are added through PutEvents, you can make use of the Filters feature:
https://docs.aws.amazon.com/personalize/latest/dg/filter-expressions.html
Create a new filter with an expression similar to this:
EXCLUDE ItemID WHERE Interactions.EVENT_TYPE IN ("your_event_type_name")
This excludes from the recommendations any item that the user has previously interacted with.
Then you can test whether events added through the PutEvents API are recognized correctly:
1. Create the filter expression as described above.
2. Create any campaign for simple recommendations (User-Personalization recipe).
3. Connect the filter to the campaign.
4. Get recommendations for any user and save them somewhere.
5. Call the PutEvents API with any of the recommended items returned in step 4 and the user id from step 4.
6. Get recommendations again for the same user as in step 4.
If the item that you added with the PutEvents call is no longer recommended, you have proof that events added through PutEvents are correctly added to the Interactions dataset.
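The last three steps above can be scripted. The sketch below takes the two API calls as injected functions, so it runs here against an in-memory stand-in. The function and class names are hypothetical, and the fake campaign only imitates what the EXCLUDE filter is meant to do.

```python
def put_event_took_effect(get_recommendations, put_event, user_id):
    """Recommend, record an interaction with one recommended item, recommend again.

    Returns True when the item disappears from the second set of recommendations.
    """
    before = get_recommendations(user_id)
    if not before:
        raise ValueError("no recommendations to test with")
    item_id = before[0]
    put_event(user_id, item_id)
    after = get_recommendations(user_id)
    return item_id not in after


class FakeCampaign:
    """In-memory stand-in for a campaign with the EXCLUDE filter attached."""

    def __init__(self, catalog):
        self.catalog = list(catalog)
        self.interactions = set()

    def get_recommendations(self, user_id):
        return [i for i in self.catalog if (user_id, i) not in self.interactions]

    def put_event(self, user_id, item_id):
        self.interactions.add((user_id, item_id))


campaign = FakeCampaign(["item-1", "item-2", "item-3"])
result = put_event_took_effect(
    campaign.get_recommendations, campaign.put_event, "user-42"
)
print(result)
```

Against the real service the two callables would wrap the campaign's recommendation request (with the filter applied) and the PutEvents call.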
What if the PutEvents call doesn't affect the recommendations?
Then you are most likely providing incorrect values in the API call. Personalize might return a 200 response even if the event provided was invalid.
To fix that, try:
Make sure the date is in the correct format. Personalize might ignore events with very old timestamps if there are many newer events (this can be configured in the Solution config).
Check that you are not passing strange values like "null" or "undefined" for sessionId, userId, or trackingId in the PutEvents params. This can cause Personalize to ignore the event (https://github.com/aws/aws-sdk-js/issues/3371).
Make sure you are passing the correct eventType value (it should match the eventType in the Solution and the Filter).
If it still doesn't work, raise a support ticket to AWS with an example PutEvents API call params.
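Part of that checklist can be automated before the call is made. The sketch below flags identifier fields that are missing or look like stringified nulls, and event types that differ from the one the filter expects. The function name is hypothetical, and treating an empty string as suspicious is my own assumption.

```python
# "null" and "undefined" come from the checklist above; "" is an added assumption.
SUSPICIOUS_VALUES = {"", "null", "undefined"}


def find_put_events_problems(params, expected_event_type):
    """Return the names of PutEvents fields that look wrong."""
    problems = []
    for field in ("trackingId", "userId", "sessionId"):
        value = params.get(field)
        if value is None or str(value).strip() in SUSPICIOUS_VALUES:
            problems.append(field)
    for index, event in enumerate(params.get("eventList", [])):
        if event.get("eventType") != expected_event_type:
            problems.append(f"eventList[{index}].eventType")
    return problems


params = {
    "trackingId": "tracking-id-placeholder",
    "userId": "undefined",  # typical result of stringifying a missing value
    "sessionId": None,
    "eventList": [{"eventType": "click", "itemId": "item-1"}],
}
problems = find_put_events_problems(params, expected_event_type="purchase")
print(problems)
```

A check like this catches malformed parameters locally; it cannot tell whether Personalize accepted the event, so the filter-based test is still needed for that.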
Are there any simpler solutions?
Maybe there are, but in our project we use this approach, and it also tests whether the filtering feature is working correctly. You will probably make use of filtering in the future anyway, so I believe it is a good enough method.
Is there a simple way to retrieve all items from a DynamoDB table using a mapping template in an API Gateway endpoint? I usually use a Lambda to process the data before returning it, but this is such a simple task that a Lambda seems like overkill.
I have a table that contains data with the following format:
roleAttributeName   roleHierarchyLevel   roleIsActive   roleName
"admin"             99                   true           "Admin"
"director"          90                   true           "Director"
"areaManager"       80                   false          "Area Manager"
I'm happy just to get the data; the representation doesn't matter, as I can transform it further down in my code.
I've been looking around, but all the tutorials explain how to get specific bits of data through queries and params like roles/{roleAttributeName}, whereas I just want to hit roles/ and get all items.
All you need to do is:
Create a resource (without curly braces, since we don't need a particular item).
Create a GET method.
Use Scan instead of Query as the Action while configuring the integration request.
The configuration is as follows:
[Screenshot: integration request configuration]
Now try the test; you should get the response.
To try it out in Postman, deploy the API first and then use the provided link in Postman, followed by your resource name.
API Gateway allows you to proxy DynamoDB as a service. There is an interesting tutorial on how to do it (you can ignore the part related to the index to make it work).
To retrieve all the items from a table, you can use Scan as the action in API Gateway. Keep in mind that DynamoDB returns at most 1 MB of data per call for both Scan and Query actions.
You can also cap the number of items yourself, before that 1 MB limit applies, by using the Limit parameter.
AWS DynamoDB Scan Reference
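Because of that per-call limit, a client that really wants every item has to follow the LastEvaluatedKey that Scan returns. The sketch below shows the loop with the scan call injected, so it runs here against a fake. scan_all and fake_scan are hypothetical names, and the parameter names follow the DynamoDB Scan API.

```python
def scan_all(scan_page, table_name, page_limit=None):
    """Collect every item by following LastEvaluatedKey until it is absent."""
    items = []
    kwargs = {"TableName": table_name}
    if page_limit is not None:
        kwargs["Limit"] = page_limit
    while True:
        page = scan_page(**kwargs)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return items
        kwargs["ExclusiveStartKey"] = last_key


# Fake table holding the three roles from the question, keyed by roleAttributeName.
ROLES = [
    {"roleAttributeName": {"S": "admin"}},
    {"roleAttributeName": {"S": "director"}},
    {"roleAttributeName": {"S": "areaManager"}},
]


def fake_scan(TableName, Limit=None, ExclusiveStartKey=None):
    """Stand-in for a DynamoDB Scan call that pages through ROLES."""
    start = 0
    if ExclusiveStartKey is not None:
        names = [r["roleAttributeName"]["S"] for r in ROLES]
        start = names.index(ExclusiveStartKey["roleAttributeName"]["S"]) + 1
    end = len(ROLES) if Limit is None else start + Limit
    page = {"Items": ROLES[start:end]}
    if end < len(ROLES):
        page["LastEvaluatedKey"] = ROLES[end - 1]
    return page


all_roles = scan_all(fake_scan, "roles", page_limit=2)
print([r["roleAttributeName"]["S"] for r in all_roles])
```

With boto3, the scan method of a DynamoDB client could be passed as scan_page. Behind API Gateway, the caller would instead send the returned key back as ExclusiveStartKey on the next request.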
I'm using AWS AppSync with React Native. Transactions happen while the device is offline, and I want to know whether data that was transacted offline has already been saved to my database online.
The fetch policy I'm using is already network-only, but the "network-only" policy does not help because it can still fetch data while the device is offline.
If you are using DynamoDB with AppSync, you can add a condition expression to your mutation resolver request mapping template. DynamoDB conditions are used to validate whether the mutation should succeed or not.
Many people use versions with DynamoDB condition checks to validate that a record hasn't already been updated, but you can add additional fields to keep track of whether the transaction has already been made.
Here is an example condition expression that you can add to your request mapping template to validate the incoming mutation:
"condition" : {
"expression" : "version = :expectedVersion",
"expressionValues" : {
":expectedVersion" : { "N" : ${context.arguments.expectedVersion} }
}
}
Here is a very comprehensive guide to using DynamoDB resolvers:
https://docs.aws.amazon.com/appsync/latest/devguide/tutorial-dynamodb-resolvers.html#modifying-the-updatepost-resolver-dynamodb-updateitem
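The effect of such a condition can be illustrated without DynamoDB. The sketch below applies an update only when the stored version equals the expected one, so a replayed offline mutation is rejected the second time. The names are hypothetical, and incrementing the version on success is an assumption about the accompanying update expression, which the snippet above does not show.

```python
class ConditionalCheckFailed(Exception):
    """Raised when the stored version differs from the expected one."""


def update_with_version_check(table, key, changes, expected_version):
    """Apply changes only if version == expected_version, then bump the version."""
    item = table[key]
    if item["version"] != expected_version:
        raise ConditionalCheckFailed(
            f"expected version {expected_version}, found {item['version']}"
        )
    item.update(changes)
    item["version"] = expected_version + 1  # assumption: the update bumps the version
    return item


table = {"post-1": {"title": "draft", "version": 1}}

# First delivery of the offline mutation succeeds.
update_with_version_check(table, "post-1", {"title": "final"}, expected_version=1)

# A replay of the same mutation carries the stale version and is rejected.
try:
    update_with_version_check(table, "post-1", {"title": "final"}, expected_version=1)
    replay_rejected = False
except ConditionalCheckFailed:
    replay_rejected = True

print(table["post-1"], replay_rejected)
```

A rejected replay tells the client that the change has already reached the database, which is the signal the question is looking for.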