How to use AWS Elastic Transcoder on a bucket full of videos? - amazon-web-services

So I have an S3 bucket full of over 200GB of different videos. It would be very time consuming to manually set up jobs to transcode all of these.
How can I use either the web UI or aws cli to transcode all videos in this bucket at 1080p, replicating the same output path in a different bucket?
I also want any new videos added to the original bucket to be transcoded automatically as soon as they are uploaded.
I've seen some posts about Lambda functions, but I don't know anything about this.

A Lambda function is essentially a temporary, managed compute environment that runs some code on demand.
The sample code in your link is what you are looking for as a solution. You can call your lambda function once for each item in the S3 bucket and kick off concurrent processing of the entire bucket.
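As a rough sketch of the batch half of that approach with boto3 (the pipeline ID and bucket name are placeholders, and the 1080p preset ID shown is assumed to be the "Generic 1080p" system preset; verify it in the Elastic Transcoder console):

```python
import boto3

s3 = boto3.client("s3")
transcoder = boto3.client("elastictranscoder")

PIPELINE_ID = "1111111111111-abcde1"       # placeholder: your Elastic Transcoder pipeline
PRESET_ID_1080P = "1351620000001-000001"   # assumed "Generic 1080p" system preset; verify in the console
SOURCE_BUCKET = "my-source-videos"         # placeholder

# Walk every object in the source bucket and submit one transcode job per video.
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=SOURCE_BUCKET):
    for obj in page.get("Contents", []):
        key = obj["Key"]
        if not key.lower().endswith((".mp4", ".mov", ".avi")):
            continue  # skip anything that isn't obviously a video
        transcoder.create_job(
            PipelineId=PIPELINE_ID,
            Input={"Key": key},
            # The output bucket is configured on the pipeline itself;
            # re-using the same key replicates the original path in that bucket.
            Output={"Key": key, "PresetId": PRESET_ID_1080P},
        )
        print(f"Submitted transcode job for {key}")
```

For the "transcode new uploads automatically" half, attach a Lambda function to the bucket's ObjectCreated event and have it call create_job the same way for the single object in the event.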

Related

AWS how to Trigger mediaconvert after video upload automatically

I am new to AWS. Most of the examples I have seen need an input file name from an S3 bucket for MediaConvert. I want to automate this process. What is the best way to do it? I want to achieve the following:
An API to upload a video (mp4) to an S3 bucket.
Trigger a MediaConvert job to process the newly uploaded video and convert it to HLS.
I know how to create an API as well as a MediaConvert job. What I need help with is automating this workflow. How can I pass the recently uploaded video to the MediaConvert job dynamically?
I think this should actually cover what you're looking for, and is straight from the source:
https://aws.amazon.com/blogs/media/vod-automation-part-1-create-a-serverless-watchfolder-workflow-using-aws-elemental-mediaconvert/
Essentially, you'll be making use of AWS Lambda, a serverless code execution product. Lambda works by allowing you to hook directly into "triggers" or events from within the AWS ecosystem (like uploading a file to S3).
The Lambda can then execute code in a number of supported languages such as JavaScript or Python, which can be used to start a MediaConvert job on the triggering object (the file uploaded to S3).
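A minimal sketch of such a handler in Python, assuming you have already created a MediaConvert job template holding your HLS output settings and an IAM role for MediaConvert (the template name, role ARN, and environment variable names are placeholders):

```python
import os
import urllib.parse
import boto3

# Assumed environment variables configured on the Lambda function.
MEDIACONVERT_ROLE = os.environ["MEDIACONVERT_ROLE"]   # IAM role that MediaConvert assumes
JOB_TEMPLATE = os.environ["JOB_TEMPLATE"]             # job template with the HLS output group

# MediaConvert uses per-account endpoints, so discover it once per container.
endpoint = boto3.client("mediaconvert").describe_endpoints()["Endpoints"][0]["Url"]
mediaconvert = boto3.client("mediaconvert", endpoint_url=endpoint)


def handler(event, context):
    # The S3 trigger passes the uploaded object in the event payload.
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = urllib.parse.unquote_plus(record["object"]["key"])

    mediaconvert.create_job(
        Role=MEDIACONVERT_ROLE,
        JobTemplate=JOB_TEMPLATE,
        Settings={
            "Inputs": [{
                "FileInput": f"s3://{bucket}/{key}",
                "AudioSelectors": {"Audio Selector 1": {"DefaultSelection": "DEFAULT"}},
            }]
        },
    )
    return {"status": "job submitted", "input": key}
```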

Edit image file in S3 bucket using AWS Lambda

I have a lot of images already uploaded to an AWS S3 bucket. I want to edit those images and replace them, and I want to do it on the AWS side; for this I want to use AWS Lambda.
I can already do this job from my local PC, but it takes a very long time, so I want to do it on a server.
Is it possible?
Unfortunately, directly editing a file in S3 is not supported; check out the thread. To work around this, you need to download the file to a server/local machine, edit it, and re-upload it to the S3 bucket. You can also enable versioning.
For Node.js you can use Jimp
For Java: ImageIO
For Python: Pillow
or you can use any technology to edit the image and later upload it again using the aws-sdk.
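For example, a minimal Pillow + boto3 sketch of that download / edit / re-upload loop (the bucket name and the resize step are placeholders for whatever edit you actually need):

```python
import io
import boto3
from PIL import Image

s3 = boto3.client("s3")
BUCKET = "my-image-bucket"  # placeholder

paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=BUCKET):
    for obj in page.get("Contents", []):
        key = obj["Key"]
        if not key.lower().endswith((".jpg", ".jpeg", ".png")):
            continue

        # Download into memory, edit, then overwrite the original object.
        body = s3.get_object(Bucket=BUCKET, Key=key)["Body"].read()
        image = Image.open(io.BytesIO(body))
        fmt = image.format or "JPEG"
        image.thumbnail((1024, 1024))   # placeholder edit: shrink to at most 1024px

        out = io.BytesIO()
        image.save(out, format=fmt)
        s3.put_object(Bucket=BUCKET, Key=key, Body=out.getvalue())
```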
For the Lambda function you can use the Serverless Framework - https://serverless.com/
I made some YouTube videos a while back on getting started with aws-lambda and serverless:
https://www.youtube.com/watch?v=uXZCNnzSMkI
You can trigger a Lambda using the AWS SDK.
Write a Lambda to process a single image and deploy it.
Then, locally, use the AWS SDK to list the images in the bucket and invoke the Lambda (asynchronously) for each file using invoke. I would also record somewhere which files have been processed, so you can continue if something fails.
Note that the default limit for Lambda is 1,000 concurrent executions, so to avoid hitting the limit you can send messages to an SQS queue (which then triggers the Lambda) or simply retry when invoke throws an error.
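A rough local driver for that pattern could look like this (the function and bucket names are placeholders, and the processed-file record here is just a local text file):

```python
import json
import boto3

s3 = boto3.client("s3")
lam = boto3.client("lambda")

BUCKET = "my-image-bucket"        # placeholder
FUNCTION_NAME = "process-image"   # placeholder: the Lambda you deployed
DONE_LOG = "processed.txt"        # simple record of what has already been sent

try:
    with open(DONE_LOG) as f:
        done = set(line.strip() for line in f)
except FileNotFoundError:
    done = set()

paginator = s3.get_paginator("list_objects_v2")
with open(DONE_LOG, "a") as log:
    for page in paginator.paginate(Bucket=BUCKET):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key in done:
                continue
            # InvocationType="Event" means asynchronous invocation.
            lam.invoke(
                FunctionName=FUNCTION_NAME,
                InvocationType="Event",
                Payload=json.dumps({"bucket": BUCKET, "key": key}).encode(),
            )
            log.write(key + "\n")
```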

How to automate AWS Elastic Transcoder Jobs for s3 buckets?

What I want: to add watermarks to all video files that are uploaded to the S3 bucket (mov, mp4, etc.), then overwrite each file, under its same name, with the newly transcoded file that has the watermark on it.
I was able to do this manually by creating a pipeline and a job with Elastic Transcoder, but that is manual. I want this done the moment a file is uploaded to the bucket: overwrite the file with the new file and done.
One, this should already be a feature, and I'm not sure why it isn't.
And two, how can I have this done automatically? Any advice? I know it's possible, I'm just not sure exactly where to start.
You need an S3 bucket and a Lambda function along with your transcoder pipeline.
Elastic Transcoder is the backbone of the process.
To automate transcoding, create a Lambda function that gets triggered by an S3 event.
A more detailed explanation is here.
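A hedged sketch of that Lambda in Python (the pipeline ID, preset ID, and watermark key are placeholders, and the watermark ID must match one defined in the preset you use):

```python
import os
import urllib.parse
import boto3

transcoder = boto3.client("elastictranscoder")

PIPELINE_ID = os.environ["PIPELINE_ID"]     # placeholder: your Elastic Transcoder pipeline
PRESET_ID = os.environ["PRESET_ID"]         # placeholder: a preset that defines a watermark slot
WATERMARK_KEY = "watermarks/logo.png"       # placeholder: watermark image in the pipeline's input bucket


def handler(event, context):
    # Triggered by the S3 ObjectCreated event on the upload bucket.
    record = event["Records"][0]["s3"]
    key = urllib.parse.unquote_plus(record["object"]["key"])

    transcoder.create_job(
        PipelineId=PIPELINE_ID,
        Input={"Key": key},
        Output={
            "Key": key,  # same key; the pipeline writes it to its configured output bucket
            "PresetId": PRESET_ID,
            "Watermarks": [{
                "PresetWatermarkId": "BottomRight",  # assumed watermark Id defined in the preset
                "InputKey": WATERMARK_KEY,
            }],
        },
    )
```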

Identifying and deleting S3 Objects that are not being accessed?

I have recently joined a company that uses S3 buckets for various projects within AWS. I want to identify, and potentially delete, S3 objects that are not being accessed (read or written), in an effort to reduce the cost of S3 in my AWS account.
I read this, which helped me to some extent.
Is there a way to find out which objects are being accessed and which are not?
There is no native way of doing this at the moment, so all the options are workarounds depending on your use case.
You have a few options:
Tag each S3 object with its last access date (e.g. 2018-10-24). First turn on object-level logging for your S3 bucket and set up CloudWatch Events for CloudTrail. The tag can then be updated by a Lambda function that runs on a CloudWatch Event fired by a Get request. Finally, create a function that runs on a scheduled CloudWatch Event to delete all objects whose date tag is earlier than today.
Query CloudTrail logs: write a custom function to query the last access times from the object-level CloudTrail logs. This could be done with Athena, or with a direct query against S3.
Create a separate index, in something like DynamoDB, which you update from your application on read activity.
Use a lifecycle policy on the S3 bucket / key prefix to archive or delete the objects after x days, as sketched below. This is based on upload time rather than last access time, so you could copy an object onto itself to reset the timestamp and start the clock again.
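For that last option, a minimal boto3 sketch of such a lifecycle rule (the bucket name, key prefix, and 90-day window are all placeholders):

```python
import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="my-bucket",                        # placeholder
    LifecycleConfiguration={
        "Rules": [{
            "ID": "expire-stale-objects",
            "Filter": {"Prefix": "reports/"},  # placeholder key prefix
            "Status": "Enabled",
            "Expiration": {"Days": 90},        # based on creation date, not last access
        }]
    },
)
```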
No objects in Amazon S3 are required by other AWS services, but you might have configured services to use the files.
For example, you might be serving content through Amazon CloudFront, providing templates for AWS CloudFormation or transcoding videos that are stored in Amazon S3.
If you didn't create the files and you aren't knowingly using them, you can probably delete them. But you would be the only person who would know whether they are necessary.
There is a recent AWS blog post that describes an interesting, cost-optimized approach to this problem.
Here is the description from AWS blog:
The S3 server access logs capture S3 object requests. These are generated and stored in the target S3 bucket.
An S3 inventory report is generated for the source bucket daily. It is written to the S3 inventory target bucket.
An Amazon EventBridge rule is configured that will initiate an AWS Lambda function once a day, or as desired.
The Lambda function initiates an S3 Batch Operations job to tag objects in the source bucket that must be expired, using the following logic:
Capture the number of days (x) configuration from the S3 Lifecycle configuration.
Run an Amazon Athena query that will get the list of objects from the S3 inventory report and server access logs. Create a delta list with objects that were created earlier than 'x' days, but not accessed during that time.
Write a manifest file with the list of these objects to an S3 bucket.
Create an S3 Batch operation job that will tag all objects in the manifest file with a tag of "delete=True".
The Lifecycle rule on the source S3 bucket will expire all objects that were created prior to 'x' days. They will have the tag given via the S3 batch operation of "delete=True".
Expiring Amazon S3 Objects Based on Last Accessed Date to Decrease Costs
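The lifecycle rule in the last step could look roughly like this (a sketch only; the bucket name and the 'x' day window are placeholders, and the tag must match what the batch job applies):

```python
import boto3

boto3.client("s3").put_bucket_lifecycle_configuration(
    Bucket="my-source-bucket",   # placeholder
    LifecycleConfiguration={
        "Rules": [{
            "ID": "expire-tagged-unused-objects",
            # Only objects the S3 Batch Operations job tagged with delete=True are expired.
            "Filter": {"Tag": {"Key": "delete", "Value": "True"}},
            "Status": "Enabled",
            "Expiration": {"Days": 30},  # the 'x' days from the article; placeholder value
        }]
    },
)
```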

Is there any service on AWS that can help me convert mp4 files to mp3?

I'm new to Amazon Web Services and I'm wondering if the platform offers any solution to convert media files to different formats (mp4 to mp3), or do I have to use a Lambda function with a third-party library to achieve this?
Thank you!
You can get up and running quickly with Elastic Transcoder. You will need to:
create two S3 buckets, your 'inbox' and 'outbox'
add a transcoder pipeline specifying which buckets are your in/out buckets, and what file types you want to transcode from and to
you can set up a trigger so that every time something hits the in bucket the process runs, or you can place something in the in bucket and use the SDK or CLI to trigger a job
Two things to note:
When you fire a job, you have to pass in the name of the file that will be created. If the file already exists in the out bucket, an error will be thrown.
As with all of AWS's turnkey services, you get a little free up front, then it gets expensive. Once you get the hang of it, you can save some money rolling your own in Lambda like this.
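Putting that together, a quick boto3 sketch (the pipeline ID and object keys are placeholders; rather than hard-coding a preset ID, this assumes one of the system audio presets outputs MP3 and looks it up by container):

```python
import boto3

transcoder = boto3.client("elastictranscoder")

PIPELINE_ID = "1111111111111-abcde1"   # placeholder: pipeline wired to your inbox/outbox buckets

# Find a preset that outputs MP3 rather than hard-coding the preset Id.
presets = transcoder.list_presets()["Presets"]
mp3_preset = next(p for p in presets if p["Container"] == "mp3")

transcoder.create_job(
    PipelineId=PIPELINE_ID,
    Input={"Key": "uploads/talk.mp4"},   # placeholder object in the inbox bucket
    # Note: if this key already exists in the outbox bucket, the job errors out.
    Output={"Key": "uploads/talk.mp3", "PresetId": mp3_preset["Id"]},
)
```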