S3 putObject event - older version received - amazon-web-services

I am setting up a CloudWatch event to trigger on S3 PutObject and call a Lambda function. I am able to trigger the function successfully, and here is the sample code that I am trying to run:
exports.handler = function(event, context, callback) {
    console.log("Incoming Event: ", event);
    console.log("please");
    const bucket = event.Records[0].s3.bucket.name;
    const filename = decodeURIComponent(event.Records[0].s3.object.key.replace(/\+/g, ' '));
    const message = `File is uploaded in - ${bucket} -> ${filename}`;
    console.log(message);
    callback(null, message);
};
I am getting an error because the event data does not contain the "Records" property. I checked the AWS docs, and the event data should contain "Records". The version shown in the documentation is "eventVersion": "2.2", but in the event data I receive, the version is eventVersion: '1.07'.
Is there some additional configuration needed to make this work?
Here is what my CloudWatch event looks like:

You've configured CloudTrail API events. The format of those events is different from the event notifications generated by S3 (the docs you linked to).
If you go to the S3 bucket and apply an event trigger there, it will be in the format you expected. See Configuring Event Notifications.
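As a rough sketch (the bucket name and function ARN are placeholders, and the Lambda invoke permission is omitted), the same notification can also be set up with the AWS SDK instead of the console; once it is in place, the handler above receives the Records array it expects:

// Hypothetical sketch: configure an S3 event notification (not a CloudTrail rule)
// so that PutObject calls invoke the Lambda with the "Records"-style event.
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

s3.putBucketNotificationConfiguration({
    Bucket: 'my-bucket', // placeholder
    NotificationConfiguration: {
        LambdaFunctionConfigurations: [{
            Events: ['s3:ObjectCreated:Put'],
            LambdaFunctionArn: 'arn:aws:lambda:us-east-1:123456789012:function:my-function' // placeholder
        }]
    }
}).promise()
    .then(() => console.log('Notification configured'))
    .catch(err => console.error(err));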

Related

Invoke an AWS Lambda function only after an Amazon DynamoDB export to Amazon S3 is totally complete

I am new to AWS and cloud technology in general, so please bear with me if the use case below is a trivial one.
Well, I have a table in Amazon DynamoDB which I am exporting to Amazon S3 using the exportTableToPointInTime API (ExportsToS3) on a scheduled basis every day at 6 AM. It is being done using an AWS Lambda function in this way:
const AWS = require("aws-sdk");

exports.handler = async (event) => {
    const dynamodb = new AWS.DynamoDB({ apiVersion: '2012-08-10' });
    const tableParams = {
        S3Bucket: '<s3-bucket-name>',
        TableArn: '<DynamoDB-Table-ARN>',
        ExportFormat: 'DYNAMODB_JSON'
    };
    await dynamodb.exportTableToPointInTime(tableParams).promise();
};
The CloudFormation (CFT) template of the AWS Lambda function takes care of creating the Lambda roles, policies, etc., along with scheduling using CloudWatch Events. This setup works, and the table is exported to the target Amazon S3 bucket every day at the scheduled time.
Now, the next thing I want is that after the export to Amazon S3 is complete, I should be able to invoke another Lambda function and pass the export status to it, so that it can do some processing.
The problem I am facing is that the above Lambda function finishes execution almost immediately, with the exportTableToPointInTime call returning a status of IN_PROGRESS.
I tried capturing the response of the above call like this:
const exportResponse = await dynamodb.exportTableToPointInTime(tableParams).promise();
console.log(exportResponse);
The output of this is:
{
    "ExportDescription": {
        "ExportArn": "****",
        "ExportStatus": "IN_PROGRESS",
        "StartTime": "2021-09-20T16:51:52.147000+05:30",
        "TableArn": "****",
        "TableId": "****",
        "ExportTime": "2021-09-20T16:51:52.147000+05:30",
        "ClientToken": "****",
        "S3Bucket": "****",
        "S3SseAlgorithm": "AES256",
        "ExportFormat": "DYNAMODB_JSON"
    }
}
(I am just obfuscating some values in the log with ****.)
As can be seen, the exportTableToPointInTime API call does not wait for the table to be exported completely. If it did, it would have returned an ExportStatus of either COMPLETED or FAILED.
Is there a way I can design the above use case to achieve my requirement of invoking another Lambda function only when the export is actually complete?
As of now, I have tried a brute-force way which works, but it is clearly inefficient: it sleeps in a loop, and the Lambda function runs for the entire duration of the export, which has cost implications.
const AWS = require("aws-sdk");

exports.handler = async (event) => {
    const dynamodb = new AWS.DynamoDB({ apiVersion: '2012-08-10' });
    const tableParams = {
        S3Bucket: '<s3-bucket-name>',
        TableArn: '<DynamoDB-Table-ARN>',
        ExportFormat: 'DYNAMODB_JSON'
    };
    const exportResponse = await dynamodb.exportTableToPointInTime(tableParams).promise();
    const exportArn = exportResponse.ExportDescription.ExportArn;
    let exportStatus = exportResponse.ExportDescription.ExportStatus;
    const sleep = (waitTimeInMs) => new Promise(resolve => setTimeout(resolve, waitTimeInMs));
    do {
        await sleep(60000); // wait 1 minute, then call the listExports API
        const listExports = await dynamodb.listExports().promise();
        const filteredExports = listExports.ExportSummaries.filter(e => e.ExportArn == exportArn);
        const currentExport = filteredExports[0];
        exportStatus = currentExport.ExportStatus;
    } while (exportStatus == 'IN_PROGRESS');

    const lambda = new AWS.Lambda();
    const paramsForInvocation = {
        FunctionName: 'another-lambda-function',
        InvocationType: 'Event',
        Payload: JSON.stringify({ 'ExportStatus': exportStatus })
    };
    await lambda.invoke(paramsForInvocation).promise();
};
What can be done to improve it, or is the above solution okay?
Thanks!!
One option to achieve this is to define a waiter in order to wait until a "Completed" status is returned from exportTableToPointInTime.
As far as I can see there are a few default waiters for DynamoDB already present, but there is not one for the export, so you'll need to write your own (you can use those already present as an example).
A good post describing how to use and write a waiter can be found here.
This way, if the export takes less than 15 minutes, you'll be able to catch it within the Lambda limits without the need for a secondary Lambda.
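For instance, a hand-rolled waiter could simply poll describeExport until the export leaves the IN_PROGRESS state (a minimal sketch; the poll interval is arbitrary):

// Minimal custom "waiter": poll DescribeExport until the export is no longer
// IN_PROGRESS. exportArn comes from the exportTableToPointInTime response.
const waitForExport = async (dynamodb, exportArn, pollIntervalMs = 30000) => {
    while (true) {
        const { ExportDescription } = await dynamodb
            .describeExport({ ExportArn: exportArn })
            .promise();
        if (ExportDescription.ExportStatus !== 'IN_PROGRESS') {
            return ExportDescription; // COMPLETED or FAILED
        }
        await new Promise(resolve => setTimeout(resolve, pollIntervalMs));
    }
};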
If it takes longer than that, you'll need to decouple it, where you have multiple options as suggested by @Schepo and @wahmd:
Using an S3 event on the other end
Using AWS EventBridge
Using SNS
Combinations of the above.
Context: we want to export the DynamoDB table content into an S3 bucket and trigger a lambda when the export is complete.
In CloudTrail there's an ExportTableToPointInTime event that is sent when the export is started, but no event for when the export is finished.
A way to trigger a lambda once the export is completed is by creating an S3 trigger using this configuration:
In particular:
The creation event type is "Complete multipart upload" (others do not seem to work; not sure why).
I think the prefix can be omitted, but it's useful. It's composed of:
The first part is the table name (content in this example).
The second part, AWSDynamoDB, is set automatically by the export tool.
The suffix is the most important part. The last files created once the export is complete are manifest-summary.json and manifest-summary.md5, so we must set the suffix to one of these files.
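For reference, a rough SDK equivalent of that console configuration might look like this (the bucket name, table name and function ARN are placeholders; that the summary manifest arrives via a multipart upload is an assumption based on the observation above):

// Sketch: notify a Lambda only when the export's summary manifest is written,
// i.e. once the whole DynamoDB export has finished.
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

s3.putBucketNotificationConfiguration({
    Bucket: 'my-export-bucket', // placeholder
    NotificationConfiguration: {
        LambdaFunctionConfigurations: [{
            Events: ['s3:ObjectCreated:CompleteMultipartUpload'],
            LambdaFunctionArn: 'arn:aws:lambda:us-east-1:123456789012:function:post-export-function', // placeholder
            Filter: {
                Key: {
                    FilterRules: [
                        { Name: 'prefix', Value: 'content/AWSDynamoDB/' },
                        { Name: 'suffix', Value: 'manifest-summary.json' }
                    ]
                }
            }
        }]
    }
}).promise().catch(err => console.error(err));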
For an await call, you are missing the "async" keyword on the handler.
Change
exports.handler = (event) => {
to
exports.handler = async event => {
Since this is an await call, you need the 'async' keyword with it.
Let me know if it fixed your issue.
Also, I suspect you don't need .promise(), as the call might already be returning a promise. Anyway, please try with and without it in case it still doesn't work.
After the DynamoDB await call, you can invoke another Lambda. That would make sure your Lambda is invoked after the DynamoDB export call is completed.
To invoke the second Lambda, you can use either:
the AWS SDK invoke API, or
the putEvents API using EventBridge.
The latter option is better, as it decouples both Lambdas and the first Lambda does not have to wait until the second invocation is completed (it reduces Lambda time, and hence cost).
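A rough sketch of the EventBridge option (the event source, detail type, and bus name are placeholders; an EventBridge rule matching them would target the second Lambda):

// The exporting Lambda publishes a custom event instead of invoking the
// second Lambda directly; an EventBridge rule routes it to that Lambda.
const AWS = require('aws-sdk');
const eventBridge = new AWS.EventBridge();

const publishExportStatus = async (exportStatus) => {
    await eventBridge.putEvents({
        Entries: [{
            Source: 'my.dynamodb.export',        // placeholder custom source
            DetailType: 'ExportStatusChange',    // placeholder detail type
            Detail: JSON.stringify({ ExportStatus: exportStatus }),
            EventBusName: 'default'
        }]
    }).promise();
};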

Is it possible to make AWS WebSocket + Lambda constantly monitor DynamoDB and send responses to the client?

I have a serverless project: AWS + Angular on the frontend. Currently, I get the data when the page is initialized and refresh the data when the "update" button is pressed. However, I want to monitor changes in the table constantly. In Firebase there is the onSnapshot() method, which sends the new data when a collection is updated.
I want to make something similar with AWS. However, in the official documentation, I do not see how to do it correctly.
So here are 2 questions:
How can I connect to the WebSocket with aws-sdk? (Currently, I can connect only from the terminal with a wscat -c myurl call.) Or shall I simply send an HTTP POST to the WebSocket URL?
Is it possible to pass invoke in the callback URL? I want to get data from DynamoDB when the page initializes and then invoke it again and again (with a callback URL).
My Lambda function looks like this:
const AWS = require('aws-sdk');
const db = new AWS.DynamoDB.DocumentClient(); // db assumed to be a DocumentClient

exports.handler = async (event, context) => {
    let params = {
        TableName: "documents"
    };
    let respond = await db.scan(params).promise();
    return respond;
};
On the front-end I have:
ngOnInit(): void {
    AWS.config.credentials = new AWS.Credentials({
        accessKeyId: '//mykey', secretAccessKey: '//mysecretkey'
    });
    AWS.config.update({
        region: '//myregion'
    });
    this.updateTable(); // triggers a POST request to API Gateway => Lambda and receives a response with data
}
From my understanding, you will need to set up a DynamoDB stream and a Lambda function that responds to the database CRUD events and sends the updated data to the WebSocket connections, through AWS.ApiGatewayManagementApi, if the event data matches your criteria (a document ID, for example). (FYI: https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/ApiGatewayManagementApi.html)
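A rough sketch of that stream-triggered Lambda (the WebSocket endpoint, the "connections" table, and how connection IDs are stored are all assumptions):

// Lambda attached to the DynamoDB stream of the "documents" table: push each
// change to every WebSocket connection recorded in a "connections" table.
const AWS = require('aws-sdk');
const db = new AWS.DynamoDB.DocumentClient();
const api = new AWS.ApiGatewayManagementApi({
    endpoint: 'https://<api-id>.execute-api.<region>.amazonaws.com/<stage>' // placeholder
});

exports.handler = async (event) => {
    const changes = event.Records.map(r => r.dynamodb); // stream records

    const { Items: connections } = await db.scan({ TableName: 'connections' }).promise();

    await Promise.all(connections.map(({ connectionId }) =>
        api.postToConnection({
            ConnectionId: connectionId,
            Data: JSON.stringify(changes)
        }).promise().catch(err => {
            // a 410 means the client has disconnected; ignore or delete the stale entry
            if (err.statusCode !== 410) throw err;
        })
    ));
};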

Google Cloud Storage function sending already-delivered messages when subscriber connects

I have a bucket in Google Cloud Storage. I also have a storage function that gets triggered every time a new file/folder is created in this bucket. The idea of this function is to publish to a Google Pub/Sub topic the files that were created under the "monitoring" folder. So it gets triggered whenever there is a new file, but it only sends the message to Pub/Sub if the file was created under the mentioned folder. Besides that, I have a Java application subscribed to the Pub/Sub topic receiving these messages. It is able to receive messages without any issues at all, but when I shut down the application and launch it again, after some minutes the messages that were delivered previously come again. I checked the logs to see if the storage function was triggered, but that is not the case, and it seems that no message was sent to Pub/Sub again. All messages were acked and Pub/Sub was empty. Am I missing something related to the storage function or Pub/Sub?
This is my storage function definition:
const {PubSub} = require('@google-cloud/pubsub');

const topicName = 'test-topic-1';
const monitoringFolder = 'monitoring/';

exports.handler = (event, context) => {
    console.log(event);
    if (isMonitoringFolder(event.name)) {
        publishEvent(event);
    }
};

const publishEvent = (event) => {
    const pubSub = new PubSub();
    const payload = {
        bucket: event.bucket,
        filePath: event.name,
        timeCreated: event.timeCreated
    };
    const data = Buffer.from(JSON.stringify(payload));
    pubSub
        .topic(topicName)
        .publish(data)
        .then(id => console.log(`${payload.filePath} was added to pubSub with id: ${id}`))
        .catch(err => console.log(err));
};

const isMonitoringFolder = filePath => filePath.search(monitoringFolder) != -1;
I would really appreciate any advice.
Pub/Sub doesn't guarantee a single occurrence of the message.
Google Cloud Pub/Sub has an at-least-once delivery policy: each published message is delivered at least once for every subscription, but it can be delivered multiple times.
https://cloud.google.com/pubsub/docs/subscriber
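Because of this, the subscriber should be made tolerant of redelivery, typically by processing messages idempotently. A minimal sketch in Node.js (the same idea applies to the Java client; the subscription name is a placeholder, and the in-memory Set would need to be a persistent store to survive restarts):

// Idempotent subscriber sketch: skip messages whose ID was already processed.
const { PubSub } = require('@google-cloud/pubsub');

const pubSub = new PubSub();
const seenMessageIds = new Set();

pubSub.subscription('my-subscription').on('message', message => {
    if (seenMessageIds.has(message.id)) {
        message.ack(); // duplicate delivery, already handled
        return;
    }
    seenMessageIds.add(message.id);
    console.log(`Processing ${message.id}: ${message.data.toString()}`);
    message.ack();
});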

How can I get log content in AWS Lambda from Cloudwatch

I have this basic lambda that posts an image to a web server.
From the events in CloudWatch, I can log successfully anything that happens in that Lambda function:
From this log group (the Lambda function) I clicked on "Stream to AWS Lambda", chose a new Lambda function in which I expect to receive my logs, and didn't put any filters at all so I can get all the logs.
The Lambda is triggered properly, but the thing is, when I persist what I received in the event and context objects, I have all the CloudWatch log stream information but I don't see any of the logs.
What I get:
Do I need to specify a filter to see any logs at all? Because in the filter section, if I don't put any filters and click on "Test filter", I get all the logs in the preview window, which seems to mean it should send the whole logs to my Lambda function. Also, it looked to me like the logs were that unreadable stream in AWSLogs and that it was in Base64, but I didn't get any results trying to convert it.
Yes, the logs are gzipped and Base64-encoded, as mentioned by jarmod.
Sample code in Node.js for extracting them in a Lambda would be:
var zlib = require('zlib');

exports.handler = (input, context, callback) => {
    var payload = Buffer.from(input.awslogs.data, 'base64');
    zlib.gunzip(payload, function(e, result) {
        if (e) {
            context.fail(e);
        } else {
            result = JSON.parse(result.toString());
            console.log(result);
        }
    });
};

AWS Lambda and AWS API Gateway: How to send a binary file?

I have a Lambda function which fetches a file from S3 using the input key in the event and needs to send it to the client. I am using the following function to get the file from S3:
var AWS = require('aws-sdk');
var s3 = new AWS.S3();

function getObject(key) {
    var params = {
        Bucket: "my_bucket",
        Key: key
    };
    return new Promise(function (resolve, reject) {
        s3.getObject(params, function (err, data) {
            if (err) {
                reject(err);
                return;
            }
            resolve(data.Body);
        });
    });
}
If I send the response of this promise (a buffer) to context.succeed, it is displayed as a JSON array on the front end. How can I send it as a file? The files can be either ZIP or HTTP Archive (HAR) files. The S3 keys contain the appropriate extension. I am guessing it has something to do with the "Integration Response" in API Gateway, but I am not able to figure out what to change.
Good news, you can now handle binary input and output for API Gateway (announcement and documentation).
Basically, nothing changes in your Lambda Function, but you can now set the contentHandling API Gateway Integration property to CONVERT_TO_BINARY.
Unfortunately, the official AWS examples showcase only the HTTP API Gateway backend, as AWS Lambda support does not seem complete yet. For example, I haven't managed to return gzipped content from AWS Lambda yet, although it should be possible thanks to the new binary support and the $util.base64Decode() mapping utility.
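That said, with a Lambda proxy integration and the appropriate binary media types configured on the API, a handler along these lines should work (a sketch; the path parameter, bucket name and content type are assumptions):

// Sketch: return the S3 object Base64-encoded and let API Gateway convert it
// back to binary for the client (requires binary media types / proxy integration).
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

exports.handler = async (event) => {
    const key = event.pathParameters.key; // assumed path parameter
    const data = await s3.getObject({ Bucket: 'my_bucket', Key: key }).promise();

    return {
        statusCode: 200,
        headers: {
            'Content-Type': 'application/octet-stream',
            'Content-Disposition': `attachment; filename="${key}"`
        },
        body: data.Body.toString('base64'),
        isBase64Encoded: true
    };
};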