AWS: Lambda + S3 setup

I need to configure an already existing Lambda function that is used with S3 in/out buckets, similar to the sample the AWS docs provide here:
http://docs.aws.amazon.com/lambda/latest/dg/with-s3-example.html
The Lambda function that I'm working with does the following:
- notices new image in S3 location mybucket-test/images/in
- creates a .json file at the same S3 location - mybucket-test/images/in
- creates variations of the image at S3 location mybucket-test/images/out
My task is to create a second set of S3 in/out locations to work with the same Lambda function. The AWS tutorial at the above link doesn't mention how to set up the "resize" bucket.
After following the tutorials I ended up with this situation for the Lambda function:
- lambda notices new image in S3 location wire-qa/images/in
- creates .json file at the expected location wire-qa/images/in
- creates variations of the image in S3 location mybucket-test/images/out INSTEAD OF placing them into wire-qa/images/out
It seems to me there is something simple that I'm missing, but figuring out what it is on my own is already taking a lot of time.
I looked in all sorts of places to configure the Lambda function to use my bucket wire-qa/images/out as well as mybucket-test/images/out, but so far I have not found anything relevant.
Would very much appreciate some help here. I would at least like to understand where in the AWS example the S3 "resize" bucket is set up to work with Lambda.
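For reference, in setups like this the output bucket is usually not something you configure on the S3 or Lambda trigger side at all: the event notification only tells the function which source bucket and key to read, while the destination is whatever the function's own code or environment variables say. A minimal Python-flavoured sketch under that assumption (the bucket names, key layout, and resize step are placeholders, not the actual function):

import json
import os

import boto3

s3 = boto3.client("s3")

# Assumption: the output bucket is fixed in code or configuration, which would
# explain why results keep landing in mybucket-test/images/out regardless of
# which bucket fired the event.
OUTPUT_BUCKET = os.environ.get("OUTPUT_BUCKET", "mybucket-test")

def handler(event, context):
    for record in event["Records"]:
        # The *source* bucket and key come from the S3 event notification...
        src_bucket = record["s3"]["bucket"]["name"]
        src_key = record["s3"]["object"]["key"]

        # ...but the destination is whatever the code says it is.
        out_key = src_key.replace("images/in", "images/out")

        obj = s3.get_object(Bucket=src_bucket, Key=src_key)
        image_bytes = obj["Body"].read()  # placeholder: real code would resize here

        s3.put_object(Bucket=OUTPUT_BUCKET, Key=out_key, Body=image_bytes)
        s3.put_object(
            Bucket=src_bucket,
            Key=src_key + ".json",
            Body=json.dumps({"source": src_key}).encode("utf-8"),
        )

So the place to look for the output/"resize" bucket is most likely the function's source or its environment variables, rather than the bucket or trigger configuration.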

Related

Run AWS lambda function on existing S3 images

I wrote an AWS Lambda function in Node.js for image resizing, and it is triggered when images are uploaded.
I already have more than 1,000,000 images in the bucket.
I want to run this Lambda function on those existing images, but I haven't found a way to do it yet.
How can I run an AWS Lambda function on the existing images in an S3 bucket?
Note: I know this question has already been asked on Stack Overflow, but the issue is that none of the answers so far provide a solution.
Unfortunately, Lambda cannot be triggered automatically for objects that already exist in an S3 bucket.
You will have to invoke your Lambda function manually for each image in your S3 bucket.
First, you will need to list existing objects in your S3 bucket using the ListObjectsV2 action.
For each object in your S3 bucket, you must then invoke your Lambda function and provide the S3 object's information as the Payload.
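A hedged boto3 sketch of that list-and-invoke loop, assuming the function accepts a standard S3 event payload (the bucket and function names are placeholders):

import json

import boto3

s3 = boto3.client("s3")
lam = boto3.client("lambda")

BUCKET = "your-bucket"                  # placeholder
FUNCTION_NAME = "your-resize-function"  # placeholder

paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=BUCKET):
    for obj in page.get("Contents", []):
        # Build a minimal S3-style event so the function can reuse the same
        # code path as its real trigger.
        payload = {
            "Records": [
                {"s3": {"bucket": {"name": BUCKET}, "object": {"key": obj["Key"]}}}
            ]
        }
        lam.invoke(
            FunctionName=FUNCTION_NAME,
            InvocationType="Event",  # asynchronous, so the loop is not blocked
            Payload=json.dumps(payload).encode("utf-8"),
        )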
Yes, it's true that Lambda cannot be triggered by objects already present in your S3 bucket, but invoking your Lambda manually for each object is a rather clumsy approach.
With a couple of simple techniques you can process those images easily:
The hard way: write a program locally that does exactly the same thing as your Lambda function, with two additions: iterate over every object in your bucket, run your resize code on it, and save the result to the S3 destination path. In other words, for all the images already stored in your S3 bucket, instead of using Lambda you resize them locally on your computer and save them back to the S3 destination.
The easy way: first make sure that the S3 notification event type configured as the trigger for your Lambda is Object Created (All).
Then move all your already stored images to a new temporary bucket, and move them back to the original bucket; that way your Lambda gets triggered for each image automatically. You can do the moving easily with the SDKs provided by AWS, for example with boto3 in Python (see the sketch below).
Instead of moving (i.e. cut and paste), you can use copy and paste as well.
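A rough boto3 sketch of that round trip through a temporary bucket (the bucket names are placeholders; with a million objects you would want to parallelise this and keep an eye on the request costs):

import boto3

s3 = boto3.client("s3")

SRC_BUCKET = "your-original-bucket"  # placeholder
TMP_BUCKET = "your-temp-bucket"      # placeholder

paginator = s3.get_paginator("list_objects_v2")

# 1) Copy every object out to the temporary bucket...
for page in paginator.paginate(Bucket=SRC_BUCKET):
    for obj in page.get("Contents", []):
        s3.copy_object(
            Bucket=TMP_BUCKET,
            Key=obj["Key"],
            CopySource={"Bucket": SRC_BUCKET, "Key": obj["Key"]},
        )

# 2) ...then copy everything back. Each copy back into the original bucket
#    raises an ObjectCreated event and triggers the Lambda.
for page in paginator.paginate(Bucket=TMP_BUCKET):
    for obj in page.get("Contents", []):
        s3.copy_object(
            Bucket=SRC_BUCKET,
            Key=obj["Key"],
            CopySource={"Bucket": TMP_BUCKET, "Key": obj["Key"]},
        )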
In addition to Mausam Sharma's comment, you can run the copy between buckets using the AWS CLI:
aws s3 sync s3://SOURCE-BUCKET-NAME s3://DESTINATION-BUCKET-NAME --source-region SOURCE-REGION-NAME --region DESTINATION-REGION-NAME
from here:
https://medium.com/tensult/copy-s3-bucket-objects-across-aws-accounts-e46c15c4b9e1
You can simply copy back to the same bucket with the CLI, which replaces each original file with itself and triggers the Lambda as a result. Note that S3 only accepts copying an object onto itself if something about it changes, so the --metadata-directive REPLACE flag is needed (be aware this replaces any custom object metadata unless you pass it again).
aws s3 cp s3://SOURCE-BUCKET-NAME s3://SOURCE-BUCKET-NAME --recursive --metadata-directive REPLACE
You can also use include/exclude glob patterns to selectively run against, say, a particular day or specific extensions.
aws s3 cp s3://SOURCE-BUCKET-NAME s3://SOURCE-BUCKET-NAME --recursive --metadata-directive REPLACE --exclude "*" --include "2020-01-15*"
It's worth noting that, like many of the other answers here, this will incur S3 request costs for the reads/writes, so apply it cautiously to buckets containing lots of files.

Cloudformation Init calling S3 bucket

https://docs.aws.amazon.com/quickstart/latest/rd-gateway/step2.html#existing-standalone
https://s3.amazonaws.com/quickstart-reference/microsoft/rdgateway/latest/templates/rdgw-standalone.template
I'm referencing the template above to create my Remote Desktop Gateway (RDGW) in an existing VPC. It has QSS3BucketName and QSS3KeyPrefix in the Parameters section. The Resources section has an RDGWLaunchConfiguration that references the QSS3BucketName bucket again; in the setup files, it calls the following path.
https://${QSS3BucketName}.${QSS3Region}.amazonaws.com/${QSS3KeyPrefix}submodules/quickstart-microsoft-utilities/scripts/Unzip-Archive.ps1
For some reason, after the PT30 timeout (30 minutes) it says it didn't get the required signal and rolls back. My question to the community: do I need to store these files in the S3 bucket myself, or will the template put them in S3 while it's creating the stack?
I also created a bucket in S3, copied these scripts from GitHub, and placed them in the bucket in the correct order, but it still does not work. Kind of frustrating.
The quick start template references nested templates stored in the aws-quickstart S3 bucket. If we substitute the default values into the URL above, the exact URL we get is:
https://aws-quickstart.s3.amazonaws.com/quickstart-microsoft-rdgateway/submodules/quickstart-microsoft-utilities/scripts/Unzip-Archive.ps1
You can either create the stack with the default values, without changing the AWS quick start configuration, or download all the referenced templates, modify them as per your requirements, and place them in your own bucket. Once that is done, replace the URL value in the main template with your bucket's URL.
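If you go the own-bucket route, a hedged boto3 sketch of launching the stack with the two quick start parameters pointed at your copy (the stack name, template URL, and the remaining RDGW parameters the template requires are placeholders you still need to fill in):

import boto3

cfn = boto3.client("cloudformation")

cfn.create_stack(
    StackName="rdgw-standalone",  # placeholder
    # Placeholder: your copy of rdgw-standalone.template in your own bucket.
    TemplateURL="https://YOUR-BUCKET.s3.amazonaws.com/quickstart-microsoft-rdgateway/templates/rdgw-standalone.template",
    Parameters=[
        {"ParameterKey": "QSS3BucketName", "ParameterValue": "YOUR-BUCKET"},
        {"ParameterKey": "QSS3KeyPrefix", "ParameterValue": "quickstart-microsoft-rdgateway/"},
        # ...plus the VPC, subnet, key pair, admin password, etc. that the
        # template's other parameters require.
    ],
    Capabilities=["CAPABILITY_IAM"],
)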

How to use AWS Elastic Transcoder on a bucket full of videos?

So I have an S3 bucket full of over 200GB of different videos. It would be very time consuming to manually set up jobs to transcode all of these.
How can I use either the web UI or aws cli to transcode all videos in this bucket at 1080p, replicating the same output path in a different bucket?
I also want any new videos added to the original bucket to be transcoded automatically immediately after upload.
I've seen some posts about Lambda functions, but I don't know anything about this.
A Lambda function is essentially a temporary machine that runs some code.
The sample code in your link is what you are looking for as a solution: you can call your Lambda function once for each item in the S3 bucket and kick off concurrent processing of the entire bucket.
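Whether you drive it from a Lambda per object or from a one-off script, the per-item work comes down to submitting an Elastic Transcoder job. A hedged boto3 sketch of the backfill as a one-off script (a pipeline pointing at your input/output buckets must already exist; the pipeline ID and 1080p preset ID are placeholders to verify in your account):

import boto3

s3 = boto3.client("s3")
transcoder = boto3.client("elastictranscoder")

INPUT_BUCKET = "your-video-bucket"     # placeholder
PIPELINE_ID = "1111111111111-abcde1"   # placeholder: pipeline wired to your buckets
PRESET_1080P = "1351620000001-000001"  # placeholder: check the 1080p preset ID in your console

paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=INPUT_BUCKET):
    for obj in page.get("Contents", []):
        key = obj["Key"]
        # Reusing the same key on the output side replicates the path structure
        # in the pipeline's output bucket.
        transcoder.create_job(
            PipelineId=PIPELINE_ID,
            Input={"Key": key},
            Outputs=[{"Key": key, "PresetId": PRESET_1080P}],
        )

For the "transcode new uploads automatically" part, an S3 ObjectCreated trigger invoking a small Lambda that makes the same create_job call covers it.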

Does AWS S3 Have a concept of files being 'updated'?

I'd like to write a Lambda function that is triggered when files are added or modified in an s3 bucket and processes them and moves them elsewhere, clobbering older versions of the files.
I'm wondering if AWS Lambda can be configured to trigger when files are updated?
After reviewing the Boto3 documentation for S3, it looks like the only things that can happen to objects in an S3 bucket are creations and deletions.
Additionally, the AWS documentation seems to indicate there is no way to trigger things on 'updates' to S3.
Am I correct in thinking there is no real concept of an 'update' to a file in S3 and that an update would actually be when something was destroyed and recreated? If I'm mistaken, how can I trigger a Lambda function when an S3 file is changed in a bucket?
No, there is no concept of updating a file on S3. A file on S3 is updated the same way it is uploaded in the first place - through a PUT object request. (Relevant answer here.) An S3 bucket notification configured to trigger on a PUT object request can execute a Lambda function.
There is now additional functionality for S3 buckets: under Properties you can enable versioning for the bucket. If you then set an object-created trigger on S3 for your Lambda function, it will be executed every time you 'update' the same file, since each update creates a new version.
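For completeness, a minimal boto3 sketch of wiring that object-created trigger up outside the console (the bucket name and function ARN are placeholders, and the function also needs a resource-based permission that allows S3 to invoke it):

import boto3

s3 = boto3.client("s3")

BUCKET = "my-bucket"  # placeholder
FUNCTION_ARN = "arn:aws:lambda:us-east-1:123456789012:function:process-file"  # placeholder

# "s3:ObjectCreated:*" covers PUT, POST, COPY and multipart completion, which
# is as close to an "update" event as S3 provides: every overwrite is a new PUT.
s3.put_bucket_notification_configuration(
    Bucket=BUCKET,
    NotificationConfiguration={
        "LambdaFunctionConfigurations": [
            {"LambdaFunctionArn": FUNCTION_ARN, "Events": ["s3:ObjectCreated:*"]}
        ]
    },
)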

AWS Lambda and zip upload from S3

The benefits of this feature are not clear to me (I didn't find any good documentation):
Is it just faster when you reuse the same zip for many Lambda functions, because you upload it only once and simply give the S3 link URL to each Lambda function?
If you use an S3 link, will all your Lambda functions be updated with the latest code automatically when you re-upload the zip file? In other words, is the zip file on S3 a "reference" that is used on each call to a Lambda function?
Thank you.
EDIT:
I have been asked "Why do you want the same code for multiple Lambda functions anyway?"
Because I use AWS Lambda with AWS API Gateway so I have 1 project with all my handlers which are actual "endpoints" for my RESTful API.
EDIT #2:
I confirm that uploading a modified version of the zip file to S3 doesn't change the result of the existing Lambda functions.
If an AWS guy reads this message, it would be great to have some kind of batch update feature that updates a set of selected Lambda functions with one zip file on S3 in one click (or even an "automatic update" feature that detects when the file has been updated ;-))
Let's say you have 50 handlers in one project and you modify something global that impacts all of them: currently you have to go through all your Lambda functions and update the zip file manually...
The code is imported from the zip to Lambda. It is exactly the same as uploading the zip file through the Lambda console or API. However, if your Lambda function is big (they say >10MB), they recommend uploading to S3 and then using the S3 import functionality because that is more stable than directly uploading from the Lambda page. Other than that, there is no benefit.
So for question 1: no. Why do you want the same code for multiple Lambda functions anyway?
Question 2: If you overwrite the zip you will not update the Lambda function code.
To add to other people's use cases, having the ability to update a Lambda function from S3 is extremely useful within an automated deployment / CI process.
The instructions under New Deployment Options for AWS Lambda include a simple Lambda function that can be used to copy a ZIP file from S3 to Lambda itself, as well as instructions for triggering its execution when a new file is uploaded.
As an example of how easy this can make development and deployment, my current workflow is:
I update my Node lambda application on my local machine, and git commit it to a remote repository.
A Jenkins instance picks up the commit, pulls down the appropriate files, adds them into a ZIP file and uploads this to an S3 bucket.
The LambdaDeployment function then automatically deploys this new version for me, without me needing to even leave my development environment.
To answer what I think is the essence of your question, AWS allows you to use S3 as the origin for your Lambda zip file because sometimes uploading large files via your browser can timeout. Also, storing your code on S3 allows you to store it centrally, rather than on your computer and I'm sure there is a CodeCommit tie-in there as well.
Using the S3 method of uploading your code to Lambda also allows you to upload larger files (AWS has a 10MB limit when uploading via web browser).
#!/bin/bash
cd /your/workspace
#zips up the new code
zip -FSr yourzipfile.zip . -x *.git* *bin/\* *.zip
#Updates the Lambda function code with the new zip file (upload the zip to your S3 bucket separately if you also need it as a CloudFormation lambda CodeUri source)
aws lambda update-function-code --function-name arn:aws:lambda:us-west-2:YOURID:function:YOURFUNCTIONNAME --zip-file file://yourzipfile.zip
This depends on the AWS CLI being installed and an AWS profile being set up:
aws --profile yourProfileName configure
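On the "batch update" wish in the question's edit: there is no one-click feature for it, but a hedged boto3 sketch that points a list of functions at the same S3 zip is short (the function names, bucket, and key are placeholders):

import boto3

lam = boto3.client("lambda")

BUCKET = "my-deploy-bucket"   # placeholder: where the shared zip lives
KEY = "yourzipfile.zip"       # placeholder
FUNCTIONS = ["endpoint-one", "endpoint-two", "endpoint-three"]  # placeholders

for name in FUNCTIONS:
    # Point each function at the latest zip in S3; this is the API equivalent
    # of redoing the "upload from S3" step in the console for every function.
    lam.update_function_code(FunctionName=name, S3Bucket=BUCKET, S3Key=KEY)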