AWS Configure Kinesis Stream with DynamoDB Lambda

Following this question, AWS DynamoDB Stream into Redshift, I want to build this pipeline:
DynamoDB --> DynamoDB Streams --> Lambda Function --> Kinesis Firehose --> Redshift.
How do I configure my Kinesis Firehose delivery stream to pick up the Lambda function as its source?
I created a DynamoDB table (Purchase Sales) and enabled DynamoDB Streams. Then I configured the Lambda function to pick up the DynamoDB stream. My question is: how do I configure Kinesis Firehose to pick up the Lambda function as its source? I know how to configure a Lambda transformation, but I would like to use the Lambda as the source. I'm not sure how to configure the Direct PUT source below.
Thanks,
Performed these steps:

In your case, you would stream DynamoDB to Redshift like this:
DynamoDB --> DynamoDBStreams --> Lambda Function --> Kinesis Firehose --> Redshift.
First, you need a Lambda function to handle the DynamoDB Streams events. For each DynamoDB Streams event, use the Firehose PutRecord API to send the data to Firehose. From the SDK example:
var firehose = new AWS.Firehose();
firehose.putRecord({
  DeliveryStreamName: 'STRING_VALUE', /* required */
  Record: { /* required */
    Data: new Buffer('...') || 'STRING_VALUE' /* Strings will be Base-64 encoded on your behalf */ /* required */
  }
}, function(err, data) {
  if (err) console.log(err, err.stack); // an error occurred
  else console.log(data); // successful response
});
Next, we have to know how the data gets inserted into Redshift. From the Firehose documentation:
For data delivery to Amazon Redshift, Kinesis Firehose first delivers
incoming data to your S3 bucket in the format described earlier.
Kinesis Firehose then issues an Amazon Redshift COPY command to load
the data from your S3 bucket to your Amazon Redshift cluster.
So, we need to know what data format lets the COPY command map the data onto the Redshift schema; we have to follow the data format requirements of the Redshift COPY command:
By default, the COPY command expects the source data to be
character-delimited UTF-8 text. The default delimiter is a pipe
character ( | ).
So, you could program the Lambda to take the DynamoDB Streams event as input, transform each record into a pipe (|) separated line, and write it to Firehose:
var firehose = new AWS.Firehose();
firehose.putRecord({
  DeliveryStreamName: 'YOUR_FIREHOSE_NAME',
  Record: { /* required */
    Data: "RED_SHIFT_COLUMN_1_DATA|RED_SHIFT_COLUMN_2_DATA\n"
  }
}, function(err, data) {
  if (err) console.log(err, err.stack); // an error occurred
  else console.log(data); // successful response
});
Remember to add the \n, as Firehose will not append a newline for you.
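Putting both pieces together, here is a minimal sketch of such a handler in Python (boto3); the delivery stream name and the attribute-to-column mapping are assumptions for illustration, not the asker's actual schema.
```
# Minimal sketch: DynamoDB Streams event -> pipe-delimited lines -> Firehose.
# "PurchaseSalesFirehose" and the attribute names are illustrative assumptions.
import boto3

firehose = boto3.client('firehose')

def lambda_handler(event, context):
    lines = []
    for record in event['Records']:
        # Only INSERT/MODIFY events carry a NewImage
        image = record['dynamodb'].get('NewImage')
        if not image:
            continue
        # Map DynamoDB attributes onto the Redshift column order (assumed schema)
        lines.append('|'.join([
            image['orderId']['S'],
            image['amount']['N'],
        ]))

    if lines:
        firehose.put_record(
            DeliveryStreamName='PurchaseSalesFirehose',  # assumed name
            Record={'Data': ('\n'.join(lines) + '\n').encode('utf-8')}
        )
```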

Related

DynamoDB Trigger Lambda Function Call Failed

I am trying to have events in a DynamoDB table trigger a Lambda function that moves the events into Kinesis Data Firehose. Firehose then batches the records and sends them to an S3 bucket. The Lambda function I am using as the trigger fails.
This is the Lambda code for the trigger:
```
import json
import boto3

firehose_client = boto3.client('firehose')

def lambda_handler(event, context):
    resultString = ""
    for record in event['Records']:
        parsedRecord = parseRawRecord(record['dynamodb'])
        resultString = resultString + json.dumps(parsedRecord) + "\n"

    print(resultString)

    response = firehose_client.put_record(
        DeliveryStreamName="OrdersAuditFirehose",
        Record={
            'Data': resultString
        }
    )

def parseRawRecord(record):
    result = {}
    result["orderId"] = record['NewImage']['orderId']['S']
    result["state"] = record['NewImage']['state']['S']
    result["lastUpdatedDate"] = record['NewImage']['lastUpdatedDate']['N']
    return result
```
Edit: Cloudwatch Log
The goal is to get the Lambda function, triggered by events in DynamoDB, to move those events into Kinesis.
Edit2: Cloudwatch
I'm going to post this as my initial answer, and will edit when you return with the exception from your Lambda Logs.
Edit
The issue is that you are looking up a key in a dict which does not exist:
result["lastUpdatedDate"] = record['NewImage']['lastUpdatedDate']['N']
lastUpdatedDate is not inside record['NewImage'], so the lookup raises a KeyError. It may be useful to check the contents of the dict by logging it to your logs:
print(record['NewImage'])
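If the attribute can legitimately be missing from some stream records, a hedged sketch of a more defensive parser (falling back to None, which is an assumption about what your downstream can accept) could look like this:
```
# Sketch: tolerate missing attributes instead of raising KeyError.
# Falling back to None is an assumption; pick a default that suits your schema.
def parseRawRecord(record):
    image = record.get('NewImage', {})
    return {
        "orderId": image.get('orderId', {}).get('S'),
        "state": image.get('state', {}).get('S'),
        "lastUpdatedDate": image.get('lastUpdatedDate', {}).get('N'),
    }
```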
There is no need to use Lambda when you want integration between DynamoDB and Firehose. Instead of DynamoDB Streams, you can use Kinesis Data Streams, which integrates directly with Firehose without the need for extra code.
DynamoDB -> Kinesis Stream -> Kinesis Firehose -> S3
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/kds.html
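For this route the wiring is configuration rather than processing code. As a rough boto3 sketch (the table name and stream ARN are placeholders), enabling the streaming destination looks like this:
```
# Sketch: point a DynamoDB table at an existing Kinesis Data Stream.
# Table name and stream ARN are placeholders.
import boto3

dynamodb = boto3.client('dynamodb')
dynamodb.enable_kinesis_streaming_destination(
    TableName='Orders',
    StreamArn='arn:aws:kinesis:us-east-1:111111111111:stream/orders-stream'
)
# The Firehose delivery stream is then created with this Kinesis stream as its
# source and S3 as its destination (in the console or via create_delivery_stream).
```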
If you really want to use DynamoDB Streams, then you can also avoid the Lambda code by using EventBridge Pipes:
DynamoDB -> EventBridge Pipe -> Kinesis Firehose -> S3
https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-pipes.html#pipes-targets
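As a rough sketch of that wiring with boto3, where the names, ARNs, and the exact parameter shape are assumptions based on the Pipes API, so treat it as a starting point rather than a verified configuration:
```
# Sketch: a pipe from a DynamoDB stream to a Firehose delivery stream.
# All names and ARNs are placeholders; RoleArn must allow reading the stream
# and writing to Firehose.
import boto3

pipes = boto3.client('pipes')
pipes.create_pipe(
    Name='ddb-to-firehose',
    RoleArn='arn:aws:iam::111111111111:role/OrdersPipeRole',
    Source='arn:aws:dynamodb:us-east-1:111111111111:table/Orders/stream/2023-01-01T00:00:00.000',
    SourceParameters={
        'DynamoDBStreamParameters': {'StartingPosition': 'LATEST'}
    },
    Target='arn:aws:firehose:us-east-1:111111111111:deliverystream/OrdersAuditFirehose'
)
```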
Both of the above solutions result in no-code delivery of DynamoDB events to Firehose.

how to publish result of a Lambda Function to a cross-account Kinesis Stream

Say I have two accounts, 111111111111 and 222222222222, and want to do the following:
(Lambda in 111111111111) -> (Kinesis in 222222222222)
where the Lambda function is triggered by a data source (which could be another Kinesis stream in 111111111111):
exports.handler = async (event, context) => {
  // data transformed here
  const result = event.records.map(record => {});
  return {data: result};
}
I am trying to format the data in 111111111111's Lambda function and then send it to 222222222222's Kinesis stream, but I couldn't find many resources on this.
I came across this SO post. IAM role aside, it seems like each invocation of the Lambda function needs to create a session with the 222222222222 account and create a Kinesis client in order to call PutRecord. This looks like a red flag to me, as I was thinking the Lambda function could just set up a cross-account destination with a resourceArn to send its result data to. What am I missing, and is there a better alternative?
This looks like a red flag to me, as I was thinking the Lambda function could just set up a cross-account destination with a resourceArn to send its result data to.
This is not a red flag. Cross-account IAM roles are how it is done for Kinesis, because Kinesis streams don't have resource-based policies. So you have to assume an IAM role from account 2 in your Lambda.
I'm not sure which resourceArn you are referring to. The only one I can think of is the resourceArn for Kinesis Data Analytics, which does not apply to Kinesis Data Streams.
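As a rough sketch of that assume-role approach in Python (boto3), where the role ARN, stream name, region, and partition key choice are all illustrative assumptions:
```
# Sketch: assume a role in account 222222222222, then put records into its
# Kinesis stream. Role ARN, stream name, and region are placeholders; the
# Lambda's own role needs sts:AssumeRole on the target role, and the target
# role must trust the Lambda's role.
import json
import boto3

sts = boto3.client('sts')

def lambda_handler(event, context):
    creds = sts.assume_role(
        RoleArn='arn:aws:iam::222222222222:role/CrossAccountKinesisWriter',
        RoleSessionName='cross-account-put'
    )['Credentials']

    kinesis = boto3.client(
        'kinesis',
        region_name='eu-west-1',
        aws_access_key_id=creds['AccessKeyId'],
        aws_secret_access_key=creds['SecretAccessKey'],
        aws_session_token=creds['SessionToken']
    )

    for record in event.get('Records', []):
        # transform the record here as needed before writing it out
        kinesis.put_record(
            StreamName='target-stream',
            Data=json.dumps(record).encode('utf-8'),
            PartitionKey=record.get('eventID', 'default')
        )
```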

How to copy state from my react-native app to my AWS Kinesis Firehose stream

With the necessary imports and configuration, it is quite simple to use Storage.put from AWS Amplify to write the text in this.state.appended to the S3 bucket mybucket using a function:
doComprehend = () => {
  Storage.put('apptext.txt', this.state.appended)
    .then(result => {
      console.log('result: ', result)
    })
    .catch(err => console.log('error: ', err));
}
How would I alter this function so that it sends the text in this.state.appended to an AWS Firehose stream instead?
Context of my problem: my RN app sends text to an S3 bucket as the apptext.txt file; that triggers a Lambda that calls AWS Comprehend and Comprehend Medical, and it all works, but I can't append the results of the Lambda to the apptext.txt file inside the S3 bucket, because that's not possible...
I want to change it so that the text I'm interested in is first put into a Firehose stream, transformed there by the Lambda, and then saved to the S3 bucket.
I've looked at the Amplify docs. I've looked at aws-sdk-js. I tried npm firehoser. I can't figure out how to do it.

AWS Kinesis S3 Redshift

I am trying to send data to Amazon Redshift through Kinesis via S3 using the COPY command. The data looks fine in S3 as JSON, but when it reaches Redshift, many of the column values are empty. I have tried everything and haven't been able to fix this for the past few days.
Data format in s3:
{"name":"abc", "clientID":"ahdjxjxjcnbddnn"}
{"name":"def", "clientID":"uhrbdkdndbdbnn"}
Redshift table structure:
User(
name: varchar(20)
clientID: varchar(25)
)
In Redshift I get only one of the two fields populated.
I have used JSON 'auto' in the COPY command.
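For reference, a minimal sketch of the kind of COPY described above, issued through the Redshift Data API with boto3; the cluster, database, user, IAM role, table name, and S3 path are all placeholders, not the asker's actual values.
```
# Sketch: run the JSON 'auto' COPY described above via the Redshift Data API.
# Cluster, database, user, role ARN, table name, and S3 prefix are placeholders.
import boto3

redshift_data = boto3.client('redshift-data')

copy_sql = """
COPY user_table
FROM 's3://my-bucket/firehose-prefix/'
IAM_ROLE 'arn:aws:iam::111111111111:role/RedshiftCopyRole'
FORMAT AS JSON 'auto';
"""

redshift_data.execute_statement(
    ClusterIdentifier='my-redshift-cluster',
    Database='dev',
    DbUser='awsuser',
    Sql=copy_sql
)
```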

Sending data to kinesis stream (in different AWS account) using lambda function

I have a Lambda function that writes to a Kinesis stream. But now, I want to write to a Kinesis stream which belongs to a different AWS account. Assuming I have all the necessary cross-account permissions, how can I send data to this stream? How should I change the parameters when I call the Kinesis constructor or the putRecord function?
The method shown below would technically work; however, hardcoding creds, or even configuring creds into a Lambda, seems a bit extraneous to me, since Lambdas themselves require that you have a role. What you need to do is create a cross-account trust and assume the role using STS.
Create a role in the account with the Kinesis stream, and set it to trust your Lambda's role.
Give that role a policy that allows it to put records to the Kinesis stream.
In your Lambda code, use STS to create a session in the account with the Kinesis stream and put your record.
Note that your Lambda will need a policy that allows it to call sts:AssumeRole on the second account's role.
It is described a bit more clearly here Providing Access to Accounts you Own
First, you need to configure the Kinesis client (I chose JavaScript for the example):
var kinesis = new AWS.Kinesis({
  accessKeyId: 'XXX',
  secretAccessKey: 'YYY',
  region: 'eu-west-1',
  apiVersion: '2013-12-02'
});
For more information, take a look at Constructing a Kinesis object.
To write/put a record, use the following:
var params = {
  Data: new Buffer('...') || 'STRING_VALUE', /* required */
  PartitionKey: 'STRING_VALUE', /* required */
  StreamName: 'STRING_VALUE', /* required */
  ExplicitHashKey: 'STRING_VALUE',
  SequenceNumberForOrdering: 'STRING_VALUE'
};
kinesis.putRecord(params, function (err, data) {
  if (err) console.log(err, err.stack); // an error occurred
  else console.log(data); // successful response
});
For more information, take a look at Calling the putRecord operation.