Amazon Cognito Streams gives garbage data in the Lambda events - amazon-web-services

I have configured Lambda to read from the Kinesis stream that Cognito Sync writes to. I see an event in the Lambda logs for every Cognito Sync call, but the event does not contain the key-value data that I set in the dataset, even though I can see the key-value pair being sent to Cognito in the request.
The event that Lambda receives looks like the following. How do I get the dataset key-values out of it?
2015-03-07T16:18:40.082Z 9be3582e-c4e5-11e4-be53-6f01632e7b6d
{
"Records": [
{
"eventSource": "aws:kinesis",
"kinesis": {
"partitionKey": "us-east-1:d4bfff5d-9605-484d-9aab-0e63829b1e54-Fia",
"kinesisSchemaVersion": "1.0",
"data": "eyJpZGVudGl0eVBvb2xJZCI6InVzLWVhc3QtMTowMmFiM2JiYi04N2RlLTQyMzUtYWEyZS1kNzliYzQ1YmFmOTciLCJpZGVudGl0eUlkIjoidXMtZWFzdC0xOmQ0YmZmZjVkLTk2MDUtNDg0ZC05YWFiLTBlNjM4MjliMWU1NCIsImRhdGFzZXROYW1lIjoiRmlhciIsIm9wZXJhdGlvbiI6InJlcGxhY2UiLCJwYXlsb2FkVHlwZSI6IklubGluZSIsImtpbmVzaXNTeW5jUmVjb3JkcyI6W3sia2V5IjoiU3RhdGUiLCJ2YWx1ZSI6IltbXCItXCIsXCItXCIsXCItXCIsXCItXCIsXCItXCIsXCItXCIsXCItXCJdLFtcIi1cIixcIi1cIixcIi1cIixcIi1cIixcIi1cIixcIi1cIixcIi1cIl0sW1wiT1wiLFwiLVwiLFwiLVwiLFwiLVwiLFwiLVwiLFwiLVwiLFwiLVwiXSxbXCJYXCIsXCItXCIsXCItXCIsXCItXCIsXCItXCIsXCItXCIsXCItXCJdLFtcIk9cIixcIi1cIixcIi1cIixcIi1cIixcIi1cIixcIi1cIixcIi1cIl0sW1wiWFwiLFwiLVwiLFwiLVwiLFwiLVwiLFwiLVwiLFwiLVwiLFwiLVwiXV0iLCJzeW5jQ291bnQiOjYsImxhc3RNb2RpZmllZERhdGUiOjE0MjU3NDUxMTQ3NjMsImRldmljZUxhc3RNb2RpZmllZERhdGUiOjE0MjU3NDUxMTE0NDAsIm9wIjoicmVwbGFjZSJ9XSwia2luZXNpc1N5bmNSZWNvcmRzVVJMIjpudWxsLCJsYXN0TW9kaWZpZWREYXRlIjoxNDI1NzQ1MTE0NzYzLCJzeW5jQ291bnQiOjZ9",
"sequenceNumber": "49548516359756600751834810213344902796782628138546888706"
},
"eventID": "shardId-000000000000:49548516359756600751834810213344902796782628138546888706",
"invokeIdentityArn": "arn:aws:iam::111111111111:role/LambdaKinesisInvocationRole-funcog",
"eventName": "aws:kinesis:record",
"eventVersion": "1.0",
"eventSourceARN": "arn:aws:kinesis:us-east-1:111111111111:stream/funcog",
"awsRegion": "us-east-1"
}
]
}

It appears that the data you get from Kinesis (in the .Records[0].kinesis.data element) is Base64-encoded. Decoding gives the following:
{"identityPoolId":"us-east-1:02ab3bbb-87de-4235-aa2e-d79bc45baf97","identityId":"us-east-1:d4bfff5d-9605-484d-9aab-0e63829b1e54","datasetName":"Fiar","operation":"replace","payloadType":"Inline","kinesisSyncRecords":[{"key":"State","value":"[[\"-\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\"],[\"-\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\"],[\"O\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\"],[\"X\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\"],[\"O\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\"],[\"X\",\"-\",\"-\",\"-\",\"-\",\"-\",\"-\"]]","syncCount":6,"lastModifiedDate":1425745114763,"deviceLastModifiedDate":1425745111440,"op":"replace"}],"kinesisSyncRecordsURL":null,"lastModifiedDate":1425745114763,"syncCount":6}
So in your Lambda function, you will need to parse this data. One way to do so might be the following:
var data = JSON.parse(Buffer.from(event.Records[0].kinesis.data, 'base64').toString('utf8'));
console.log("Key: " + data.kinesisSyncRecords[0].key);
// etc...
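
For a batch of records, a minimal handler sketch along the same lines (the handler name and logging are illustrative; the field names match the decoded payload above):

exports.handler = function(event, context, callback) {
    event.Records.forEach(function(record) {
        // Kinesis record data arrives Base64-encoded; decode and parse the Cognito Sync payload
        var payload = JSON.parse(Buffer.from(record.kinesis.data, 'base64').toString('utf8'));
        console.log("Dataset: " + payload.datasetName);
        payload.kinesisSyncRecords.forEach(function(syncRecord) {
            console.log("Key: " + syncRecord.key + ", Value: " + syncRecord.value);
        });
    });
    callback(null, "Processed " + event.Records.length + " records");
};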

Related

AWS EventBridge Pipes: Is it possible to apply enrichment input transformer on an array of events?

I'm evaluating EventBridge Pipes and I have the following setup:
Source: Kinesis stream →
Filtering: Not configured →
Enrichment: Step Functions →
Target: SNS Topic
Additionally, the batch size for the Kinesis stream is set to a value greater than 1. In this scenario the Step Function therefore receives an array of Kinesis events.
The enrichment input transformer works fine as long as I provide a single event:
event:
{
"kinesisSchemaVersion": "1.0",
"partitionKey": "1",
"sequenceNumber": "49590338271490256608559692538361571095921575989136588898",
"data": "SGVsbG8sIHRoaXMgaXMgYSB0ZXN0Lg==",
"approximateArrivalTimestamp": 1545084650.987,
"eventSource": "aws:kinesis",
"eventVersion": "1.0",
"eventID": "shardId-000000000006:49590338271490256608559692538361571095921575989136588898",
"eventName": "aws:kinesis:record",
"invokeIdentityArn": "arn:aws:iam::123456789012:role/lambda-role",
"awsRegion": "us-east-2",
"eventSourceARN": "arn:aws:kinesis:us-east-2:123456789012:stream/lambda-stream"
}
transformer:
{
"data": <$.data>
}
output:
{
"data": "Hello, this is a test."
}
The desired behaviour is to get the Base64-decoded data payload, which works fine. However, if I provide an array of events, this transformer is no longer correct and produces an empty value:
[{
...
"data": "SGVsbG8sIHRoaXMgaXMgYSB0ZXN0Lg==",
...
},
{
...
"data": "SGVsbG8sIHRoaXMgaXMgYSB0ZXN0Lg==",
...
}
]
One possible transformer I figured out looks like this:
{
"data": <$.0.data>
}
The problem here is that I only get the decoded payload for the first element.
My goal is, of course, to get the decoded payload for each event in the array.
Do you have any ideas on how to approach this?

Passing variable values through S3 and SQS event trigger message

I have set up an AWS pipeline as S3 -> SQS -> Lambda. An S3 PutObject event generates an event notification message, passes it to SQS, and SQS triggers the Lambda. I have a requirement to pass a variable value from S3 to SQS and finally to Lambda as part of the event message; the value could be the file name or some other string.
Can we customize the event message JSON generated by the S3 event to pass some more information along with the message?
Does SQS simply pass the event message received from S3 to Lambda, or does it alter the message or generate its own?
How can I display or see the message generated by S3 in SQS or Lambda?
You can't manipulate the S3 event data. The schema looks like this. It will be passed on to the SQS queue, which adds some of its own metadata and passes it along to Lambda. This tutorial has a sample SQS record.
When Amazon S3 triggers an event, a message is sent to the desired destination (AWS Lambda, Amazon SNS, Amazon SQS). The message includes the bucket name and key (filename) of the object that triggered the event.
Here is a sample event (from Using AWS Lambda with Amazon S3 - AWS Lambda):
{
"Records": [
{
"eventVersion": "2.1",
"eventSource": "aws:s3",
"awsRegion": "us-east-2",
"eventTime": "2019-09-03T19:37:27.192Z",
"eventName": "ObjectCreated:Put",
"userIdentity": {
"principalId": "AWS:AIDAINPONIXQXHT3IKHL2"
},
"requestParameters": {
"sourceIPAddress": "205.255.255.255"
},
"responseElements": {
"x-amz-request-id": "D82B88E5F771F645",
"x-amz-id-2": "vlR7PnpV2Ce81l0PRw6jlUpck7Jo5ZsQjryTjKlc5aLWGVHPZLj5NeC6qMa0emYBDXOo6QBU0Wo="
},
"s3": {
"s3SchemaVersion": "1.0",
"configurationId": "828aa6fc-f7b5-4305-8584-487c791949c1",
"bucket": {
"name": "lambda-artifacts-deafc19498e3f2df",
"ownerIdentity": {
"principalId": "A3I5XTEXAMAI3E"
},
"arn": "arn:aws:s3:::lambda-artifacts-deafc19498e3f2df"
},
"object": {
"key": "b21b84d653bb07b05b1e6b33684dc11b",
"size": 1305107,
"eTag": "b21b84d653bb07b05b1e6b33684dc11b",
"sequencer": "0C0F6F405D6ED209E1"
}
}
}
]
}
The bucket can be obtained from Records[].s3.bucket.name and the key can be obtained from Records[].s3.object.key.
However, there is no capability to send a particular value, since S3 triggers the event. You could possibly derive one, though. For example, if events from several different buckets trigger the Lambda function, the function could look at the bucket name to determine why it was triggered and substitute a desired value.
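
To illustrate, here is a rough Node.js sketch of the Lambda behind the SQS queue. It assumes the default behaviour where each SQS message body contains the S3 event notification as a JSON string; the handler and variable names are just placeholders:

exports.handler = async (event) => {
    for (const sqsRecord of event.Records) {
        // The SQS message body carries the original S3 event notification as a JSON string
        const s3Event = JSON.parse(sqsRecord.body);
        for (const s3Record of s3Event.Records || []) {
            const bucket = s3Record.s3.bucket.name;
            // Object keys are URL-encoded in the notification (spaces arrive as '+')
            const key = decodeURIComponent(s3Record.s3.object.key.replace(/\+/g, ' '));
            console.log(`Object ${key} was created in bucket ${bucket}`);
        }
    }
};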

How can I get my dynamodb lambda function to check the user's identity?

I've set up a DynamoDB Lambda trigger using this documentation. The function successfully triggers when the DynamoDB table is updated, and I can view the output just fine.
I want to find the identity of the user that updated the DynamoDB table, but this info doesn't seem to be included in the event. How can I accomplish this?
The event looks like this:
{
"Records": [
{
"eventID": "1725dad5b286b22b02cffc28e5006437",
"eventName": "INSERT",
"eventVersion": "1.1",
"eventSource": "aws:dynamodb",
"awsRegion": "us-west-2",
"dynamodb": {
"ApproximateCreationDateTime": 1607759729,
"Keys": {
"receiver": {
"S": "c2217b13-12e8-42a4-a1ab-627f764493c9"
},
"sender": {
"S": "6fad5bc8-a389-4d73-b171-e709d5d8bdd8"
}
},
"NewImage": {
"createdAt": {
"S": "2020-12-12T07:55:29.105Z"
},
"receiver": {
"S": "c2217b13-12e8-42a4-a1ab-627f764493c9"
},
"sender": {
"S": "6fad5bc8-a389-4d73-b171-e709d5d8bdd8"
},
"__typename": {
"S": "FriendRequest"
},
"updatedAt": {
"S": "2020-12-12T07:55:29.105Z"
}
},
"SequenceNumber": "4092400000000003379896405",
"SizeBytes": 261,
"StreamViewType": "NEW_AND_OLD_IMAGES"
},
"eventSourceARN": "arn:aws:dynamodb:us-west-2:213277979580:table/FriendRequest-rmzuppsajfhzlfgjehczargowa-apisecure/stream/2020-12-11T07:48:02.462"
}
]
}
DynamoDB does not provide the ID of the user that wrote the record. The only way to achieve this is to make the user ID part of the DynamoDB item in the first place, which means your application needs to identify the user and write that attribute.
This obviously will not work if the item is inserted through the AWS Console; in that case the user would have to enter their own ID (or the ID of another user) by hand.
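
As a rough sketch of that approach with the Node.js SDK's DocumentClient (the table name, attribute names, and helper function are made up for illustration), the application would stamp the caller's identity onto the item at write time, so it later appears in the stream's NewImage:

const AWS = require('aws-sdk');
const docClient = new AWS.DynamoDB.DocumentClient();

// Hypothetical write path: record who made the change as part of the item itself
async function createFriendRequest(senderId, receiverId, callerIdentityId) {
    await docClient.put({
        TableName: 'FriendRequest',            // hypothetical table name
        Item: {
            sender: senderId,
            receiver: receiverId,
            createdAt: new Date().toISOString(),
            updatedBy: callerIdentityId        // e.g. the Cognito identity ID of the caller
        }
    }).promise();
}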
We have a similar scenario in which we would like to track who (a user) or what (a service) made the last update to a record. First, our architecture looks like this:
User/Service changes -> API Lambda -> DynamoDB Stream -> Lambda (normalizes stream data) -> Service Event SNS Topic
All services or functions that care about changes are attached to the SNS topic NOT the stream.
This works fine for insert/update: we have an internal-use field in which we keep this data, and it is not returned via the CRUD APIs. It looks something like: updated_by: app:service:region:user/1
When we read the record we know that the item was updated by the user with ID 1, so when we publish to the SNS topic we add a message attribute with this value.
Deletion is a bit trickier, since you can't really update and delete an item at the exact same time. What we currently do is generate a delete event manually on deletion: instead of relying on the stream, we invoke the Lambda function asynchronously and pass along the user/service data.
User/Service changes -> API Lambda -> Lambda (normalizes stream data) -> Service Event SNS Topic
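
A rough sketch of the publish step in the normalizing Lambda, assuming the updated_by value has already been pulled out of the stream record (the topic ARN and attribute name are placeholders):

const AWS = require('aws-sdk');
const sns = new AWS.SNS();

// Hypothetical: forward the normalized item plus the "who changed it" metadata to SNS
async function publishChangeEvent(item, updatedBy) {
    await sns.publish({
        TopicArn: 'arn:aws:sns:us-east-1:123456789012:service-events', // placeholder ARN
        Message: JSON.stringify(item),
        MessageAttributes: {
            updated_by: { DataType: 'String', StringValue: updatedBy }
        }
    }).promise();
}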

AWS - how to pass DynamoDb table data to Lambda function

Below is my customer table in DynamoDb
name: string
I have linked a trigger that calls a Lambda function, which in turn calls my app endpoint to do the data transformation and save the result in a SQL database.
Whenever I add or update a record in the above table, I can see that the Lambda function is being called, but I'm not sure how to capture the table data.
I need to capture the name value from the customer DynamoDB table in the Lambda function so that I can pass it to my endpoint.
I'm new to this, so please excuse me if this is too simple, but I couldn't find the information I needed to work it out.
Thanks!
Your Lambda function will receive a DynamoDB Streams record event (see Using AWS Lambda with Amazon DynamoDB for an example event).
You are going to map/loop over the Records key, where you will find objects with eventName: INSERT. Inside the dynamodb key you will find the table data that you should process in your Lambda function's code.
{
"Records": [
{
"eventID": "1",
"eventVersion": "1.0",
"dynamodb": {
"Keys": {
"Id": {
"N": "101"
}
},
"NewImage": {
"Message": {
"S": "New item!"
},
"Id": {
"N": "101"
}
},
"StreamViewType": "NEW_AND_OLD_IMAGES",
"SequenceNumber": "111",
"SizeBytes": 26
},
"awsRegion": "us-west-2",
"eventName": "INSERT",
"eventSourceARN": eventsourcearn,
"eventSource": "aws:dynamodb"
}
]
}
In your case, the data should be located at Records[0].dynamodb.NewImage.name.S
If you are working with Node.js and mixed types in your table, I suggest using AWS.DynamoDB.Converter.unmarshall, which converts a DynamoDB record into a plain JavaScript object. It allows you to do something like this:
const newImage = AWS.DynamoDB.Converter.unmarshall(event.Records[0].dynamodb.NewImage);
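
Putting it together, a minimal handler sketch (assuming the Node.js aws-sdk v2 runtime; the endpoint call is left as a comment) that picks the name value out of each inserted item might look like this:

const AWS = require('aws-sdk');

exports.handler = async (event) => {
    for (const record of event.Records) {
        if (record.eventName !== 'INSERT') continue;
        // Convert the DynamoDB-typed image into a plain JavaScript object
        const newImage = AWS.DynamoDB.Converter.unmarshall(record.dynamodb.NewImage);
        console.log('Customer name:', newImage.name);
        // ...call your app endpoint with newImage here
    }
};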

Connecting Kinesis Analytics to Lambda directly, and also indirectly via a Kinesis stream, results in different output

I am facing a very strange problem and I am not sure if it is a bug in AWS or a gap in my understanding.
Here is the problem.
I have a Kinesis Analytics application reporting every 40 seconds. So let's say it reported the following 40 seconds ago:
{first row with some data}, {second row with some data}
I connected the Kinesis Analytics application to two destinations: 1) Lambda, and the result the Lambda receives is as follows:
{
"invocationId": "2db390f4-d5a1-49cf-a792-73827d37ec34",
"applicationArn": "arn:aws:kinesisanalytics:us-east-1:638417958056:application/-analytic-app-trending-stories",
"records": [
{
"recordId": "672e17e4-06a9-41cb-a5ba-341cf5f3b879",
"lambdaDeliveryRecordMetadata": {
"retryHint": 0
},
"data": "eyJzSWQiOjQ0NDQ0LCJjSWQiOjExMjIyLCJjb3VudCI6MjAuMCwicm93VGltZTEiOiIyMDE4LTAxLTE2IDE0OjE2OjQwLjAwMSJ9"
},
{
"recordId": "6fbb409d-ddb0-40b7-b700-45b7ce87abe2",
"lambdaDeliveryRecordMetadata": {
"retryHint": 0
},
"data": "eyJzSWQiOjQ0NDQ0LCJjSWQiOjIyMiwiY291bnQiOjEzLjAsInJvd1RpbWUxIjoiMjAxOC0wMS0xNiAxNDoxNjo0MC4wMDEifQ=="
}
]
}
So, as you can see, two rows were sent to Lambda in one payload, and the data arrives at Lambda in array format. After decoding the content of the data in each record, I see the same results that were sent from Analytics.
So far so good, but the problem starts here:
I connected Analytics to a Kinesis stream and then connected the stream to Lambda. I expected the same result as in the first scenario, i.e. two records received in one payload by the second Lambda, but surprisingly I receive just one record in each payload; it seems the stream splits the array and delivers the records separately. For clarification, here is the payload I get:
{
"Records": [
{
"kinesis": {
"kinesisSchemaVersion": "1.0",
"partitionKey": "hh",
"sequenceNumber": "49580809756311348244603591366792053449767996205245136898",
"data": "eyJzSWQiOjQ0NDQ0LCJjSWQiOjExMjIyLCJjb3VudCI6MjAuMCwicm93VGltZTEiOiIyMDE4LTAxLTE2IDE0OjE2OjQwLjAwMSJ9",
"approximateArrivalTimestamp": 1516112247.8
},
"eventSource": "aws:kinesis",
"eventVersion": "1.0",
"eventID": "shardId-000000000000:49580809756311348244603591366792053449767996205245136898",
"eventName": "aws:kinesis:record",
"invokeIdentityArn": ".....lambda-img-resizer-role",
"awsRegion": "us-east-1",
"eventSourceARN": "arn:aws:kinesis:us-east-1:638417958056:stream/bni-tj-sbx22-stream-trending-stories-output"
}
]
}
As you can see, in this payload we have just one data attribute. I am totally lost; can anyone shed light on this?