Does Amazon Transcribe always require using S3 storage? - amazon-web-services

I am just getting started on looking at speech to text conversion. I want to transcribe mp3 files, but can convert them if needed. It looks as though the Google and the IBM offerings allow you to send a file and get a transcript back. However all the examples I see for Amazon require you to somehow put the file to be transcribed into S3 storage before conversion. Is that right or am I missing something? Can you just send a file to Amazon and get the transcription back without having to delve into S3?

The start_transcription_job() API call requires the input file to be in Amazon S3, in the same region as the Transcribe service being called.
It is also possible to use Amazon Transcribe Streaming, which can perform real-time transcription. However, the sample code that has been provided is only in Java.
See: aws-samples/aws-transcribe-streaming-example-java: Example Java Application using AWS SDK creating streaming transcriptions via AWS Transcribe
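To illustrate the batch flow, here is a minimal boto3 sketch. The bucket name, job name, and file names are placeholders; the file must already be in an S3 bucket in the same region as Transcribe:

```python
def build_transcribe_request(job_name, s3_uri, media_format="mp3", language_code="en-US"):
    """Build the parameters for start_transcription_job().
    The media file must already be in S3, in the same region as Transcribe."""
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": s3_uri},
        "MediaFormat": media_format,
        "LanguageCode": language_code,
    }

def start_job(job_name, s3_uri):
    import boto3  # AWS SDK for Python; requires configured credentials
    client = boto3.client("transcribe")
    # Kicks off an asynchronous job; poll get_transcription_job() for the result
    return client.start_transcription_job(**build_transcribe_request(job_name, s3_uri))
```

In practice you would first upload the mp3 (e.g. with `boto3.client("s3").upload_file(...)`), then call `start_job()` and poll `get_transcription_job()` until the job status is COMPLETED.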

Well, Amazon uses S3 as the input for the Transcribe batch service, and there is no way around it.
Use the Google or IBM offering if you are worried about the S3 round trip, but I would not be surprised to see similar response times across all three services.

Related

Aws extract tar.gz on S3

As I'm new to AWS and a little confused by all the similar services, I would like some leads and to know whether I am heading in the right direction.
I have tar.gz archives stored in AWS S3 Glacier Deep Archive. When I request a restore, I would like the archive to be automatically extracted and the folders and files it contains placed in S3 (with an expiration date).
These archives are too big to be extracted via Lambda (300 GB or more).
My idea is to trigger a Lambda function when the restore completes and use that function to start another AWS service that does the extraction. I was thinking of either AWS Batch or Fargate. Which service do you think is the most suitable? For this kind of simple task, is an ARM architecture preferable?
If someone has already done this before and has code to share, I'm interested (if not, I'll try to post my final solution here for others).
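For what it's worth, the Lambda-to-Batch handoff described above could be sketched roughly like this. The job queue and job definition names are placeholders, and the actual extraction would run inside the Batch container (ideally streaming the tar from S3 rather than downloading 300 GB to disk). Subscribing the Lambda to the bucket's `s3:ObjectRestore:Completed` event notification provides the trigger:

```python
import re

def objects_from_s3_event(event):
    """Extract (bucket, key) pairs from an S3 event notification payload."""
    return [
        (rec["s3"]["bucket"]["name"], rec["s3"]["object"]["key"])
        for rec in event.get("Records", [])
    ]

def lambda_handler(event, context):
    import boto3  # preinstalled in the AWS Lambda Python runtime
    batch = boto3.client("batch")
    for bucket, key in objects_from_s3_event(event):
        # Batch job names only allow letters, digits, hyphens, and underscores
        safe_name = "extract-" + re.sub(r"[^A-Za-z0-9_-]", "-", key)[:100]
        batch.submit_job(
            jobName=safe_name,
            jobQueue="tar-extract-queue",  # placeholder: your Batch job queue
            jobDefinition="untar-to-s3",   # placeholder: container image that does the untar
            containerOverrides={"environment": [
                {"name": "SRC_BUCKET", "value": bucket},
                {"name": "SRC_KEY", "value": key},
            ]},
        )
```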

AWS how to Trigger mediaconvert after video upload automatically

I am new to AWS. Most of the examples I have seen need an input file name from an S3 bucket for MediaConvert. I want to automate this process. What is the best way to do it? I want to achieve the following:
An API to upload a video (mp4) to an S3 bucket.
Trigger a MediaConvert job to process the newly uploaded video and convert it to HLS.
I know how to create an API as well as a MediaConvert job. What I need help with is automating this workflow. How can I pass the recently uploaded video to a MediaConvert job dynamically?
I think this should actually cover what you're looking for, and is straight from the source:
https://aws.amazon.com/blogs/media/vod-automation-part-1-create-a-serverless-watchfolder-workflow-using-aws-elemental-mediaconvert/
Essentially, you'll be making use of AWS Lambda, a serverless code execution product. Lambda works by letting you hook directly into "triggers", or events, from within the AWS ecosystem (like uploading a file to S3).
The Lambda function can then execute code in a number of supported languages like JavaScript or Python, which can be used to start a MediaConvert job on the triggering object (the file uploaded to S3).
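The wiring in that blog post boils down to something like the sketch below. This is an assumption-laden minimal version: the output bucket, IAM role ARN, and system preset name are placeholders, and real jobs usually start from a saved MediaConvert job template rather than inline settings:

```python
import urllib.parse

def build_job_settings(input_uri, destination_uri, role_arn):
    """Minimal HLS job settings for MediaConvert create_job().
    The system preset name below is an assumption; use your own preset/template."""
    return {
        "Role": role_arn,
        "Settings": {
            "Inputs": [{
                "FileInput": input_uri,
                "AudioSelectors": {"Audio Selector 1": {"DefaultSelection": "DEFAULT"}},
            }],
            "OutputGroups": [{
                "Name": "HLS",
                "OutputGroupSettings": {
                    "Type": "HLS_GROUP_SETTINGS",
                    "HlsGroupSettings": {
                        "Destination": destination_uri,
                        "SegmentLength": 6,
                        "MinSegmentLength": 0,
                    },
                },
                "Outputs": [{
                    "Preset": "System-Avc_16x9_720p_29_97fps_3500kbps",
                    "NameModifier": "_720p",
                }],
            }],
        },
    }

def lambda_handler(event, context):
    import boto3  # preinstalled in the AWS Lambda Python runtime
    rec = event["Records"][0]["s3"]
    key = urllib.parse.unquote_plus(rec["object"]["key"])  # S3 event keys arrive URL-encoded
    input_uri = "s3://{}/{}".format(rec["bucket"]["name"], key)
    # MediaConvert uses an account-specific endpoint, discovered here
    endpoint = boto3.client("mediaconvert").describe_endpoints()["Endpoints"][0]["Url"]
    mc = boto3.client("mediaconvert", endpoint_url=endpoint)
    return mc.create_job(**build_job_settings(
        input_uri,
        "s3://my-output-bucket/hls/",                       # placeholder output bucket
        "arn:aws:iam::111122223333:role/MediaConvertRole",  # placeholder IAM role
    ))
```

Point the S3 bucket's "all object create events" notification at this Lambda and each upload will dynamically feed its own key into the job.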

How to retrieve last added files programmatically from Amazon S3

I'm using the Agora SDK and REST API to record live streams and upload them to AWS S3; the SDK comes with a snapshot feature that I'm planning to use for stream thumbnails.
Both cloud recording and snapshot recording are integrated with my app and work perfectly.
The remaining problem is that the snapshots are named down to the microsecond:
Agora snapshot services image file naming
From my overview, the services work as follows: the mobile app sends data to my server, and my server makes requests to the Agora API so it joins the live stream channel, starts snapshotting, and saves the images to AWS. So I suppose it's impossible to have time synchronization between AWS, the Agora REST API, my server, and my mobile app.
I've gone through their docs and I can't find anything about retrieving the file names.
I was thinking maybe I could have a Lambda function that retrieves the last added file in a given bucket/folder, but due to my lack of knowledge of AWS and Lambda functions I don't know how to do that or whether it's possible.
Any suggestions would be appreciated.
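The "last added file" idea can be done with a plain S3 listing, no Lambda required (though the same code works inside one). A sketch, assuming boto3 and a bucket/prefix of your choosing:

```python
def newest(objects):
    """Return the most recently modified object from a list_objects_v2 result."""
    return max(objects, key=lambda o: o["LastModified"]) if objects else None

def latest_key(bucket, prefix=""):
    import boto3  # AWS SDK for Python; requires configured credentials
    s3 = boto3.client("s3")
    contents = []
    # Paginate: a snapshot folder can easily exceed the 1000-key page limit
    for page in s3.get_paginator("list_objects_v2").paginate(Bucket=bucket, Prefix=prefix):
        contents.extend(page.get("Contents", []))
    obj = newest(contents)
    return obj["Key"] if obj else None
```

Two caveats: `LastModified` is the S3 upload time, which only approximates the snapshot capture time, and listing an ever-growing prefix gets slow; partitioning snapshots by channel/date prefixes keeps the listing small.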

Appropriate AWS Service for Media(video) Streaming

I have an application which streams video (like Netflix or YouTube).
I am trying to host it on the AWS platform. I have found two different options:
the first is to store the video files in S3;
the second is to store the video files in AWS MediaStore.
On my existing platform, I have a problem with end users downloading videos through IDM (Internet Download Manager),
so I have to prevent downloading of the videos via IDM.
How can I do this on the AWS platform? Which AWS service will suit my case of preventing downloads?
Please take note of the data-out charge when you use AWS as the primary means of serving your video streams. Personally, I found it prohibitively expensive to use AWS services to serve video.
Netflix, for example, uses S3 as part of the main storage for its video streams.
As to which service you can use to hide the direct link / download link on AWS: currently there is no service provided natively by AWS for that purpose.

Is there any service on AWS that can help me convert mp4 files to mp3?

I'm new to Amazon Web Services and I'm wondering if the platform offers any solution to convert media files to different formats (mp4 to mp3), or do I have to use a Lambda function with a third-party library to achieve this.
Thank you!
You can get up and running quickly with Elastic Transcoder. You will need to:
create two S3 buckets, your 'inbox' and 'outbox'
add a transcoder pipeline specifying which bucket is your in/out bucket, and which file types you want to transcode from and to.
You can set up a trigger so that every time something hits the in bucket the process runs, or you can place something in the in bucket and use the SDK or CLI to trigger a job.
Two things to note:
When you fire a job, you have to pass in the name of the file that will be created. If the file already exists in the out bucket, an error will be thrown.
As with all of AWS's managed services, you get a little free up front, then it gets expensive. Once you get the hang of it, you can save some money by rolling your own in Lambda.
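A sketch of the Elastic Transcoder job call described above, using boto3. The pipeline ID and file keys are placeholders; the preset ID shown is, to my knowledge, the system "Audio MP3 - 128k" preset, but verify it with `list_presets()` in your account:

```python
def build_transcode_job(pipeline_id, input_key, output_key):
    """mp4 -> mp3 job parameters for ElasticTranscoder.create_job().
    Keys are relative to the pipeline's input/output buckets."""
    return {
        "PipelineId": pipeline_id,
        "Input": {"Key": input_key},
        # "1351620000001-300040" is believed to be the system 'Audio MP3 - 128k' preset
        "Outputs": [{"Key": output_key, "PresetId": "1351620000001-300040"}],
    }

def transcode(pipeline_id, input_key, output_key):
    import boto3  # AWS SDK for Python; requires configured credentials
    et = boto3.client("elastictranscoder")
    # Note: fails if output_key already exists in the pipeline's out bucket
    return et.create_job(**build_transcode_job(pipeline_id, input_key, output_key))
```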