My RDS database is currently running on AWS. There is a Lambda function for uploading and transcoding videos.
Can I change the transcoder to use my local storage instead of an Amazon S3 bucket?
If you are using the AWS Elastic Transcoder service for transcoding, the input file has to be on S3, so you have to upload your files to S3. But if you are transcoding your files inside Lambda, the Lambda function can fetch your local server's files over plain FTP (for example). The best practice, however, is to upload them to S3 first. If storage cost is your concern, you can clean up the S3 files after you are done with them.
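As a rough sketch of the "fetch over FTP from inside Lambda" idea: the host, credentials, paths, and bucket below are all placeholders, and the function needs network access to your server (for example over a VPN).

import ftplib
import boto3

s3 = boto3.client("s3")

def handler(event, context):
    # Placeholders: replace with your FTP server, credentials, paths, and bucket
    ftp = ftplib.FTP("ftp.example.internal", "user", "password")
    local_path = "/tmp/video.mp4"  # Lambda's writable scratch space, which is limited in size
    with open(local_path, "wb") as f:
        ftp.retrbinary("RETR /videos/video.mp4", f.write)
    ftp.quit()
    # Hand the file to S3 (or to the transcoding step)
    s3.upload_file(local_path, "my-video-bucket", "incoming/video.mp4")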
Related
I'm hosting videos on my site via S3 buckets. I have a video that keeps buffering. The video is 4K and 6.5 GB. Smaller videos shot at a lower resolution do not buffer. I'm having a hard time deciding whether it's the video's size in GB or its 4K resolution that's making it buffer. Does anyone know what makes a video buffer when served from an S3 bucket? Is it the size of the video or the resolution of the video? Also, does anyone know how to stop the video buffering? Yes, I've already tried using CloudFront, but with the same result.
Resolution
For large files, Amazon S3 might separate the file into multiple uploads to maximize the upload speed. The Amazon S3 console might time out during large uploads because of session timeouts. Instead of using the Amazon S3 console, try uploading the file using the AWS Command Line Interface (AWS CLI) or an AWS SDK.
Note: If you use the Amazon S3 console, the maximum file size for uploads is 160 GB. To upload a file that is larger than 160 GB, use the AWS CLI, AWS SDK, or Amazon S3 REST API.
AWS CLI
First, install and configure the AWS CLI. Be sure to configure the AWS CLI with the credentials of an AWS Identity and Access Management (IAM) user or role. The IAM user or role must have the correct permissions to access Amazon S3.
Important: If you receive errors when running AWS CLI commands, make sure that you’re using the most recent AWS CLI version.
To upload a large file, run the cp command:
aws s3 cp cat.png s3://docexamplebucket
Note: The file must be in the same directory that you're running the command from.
When you run a high-level (aws s3) command such as aws s3 cp, Amazon S3 automatically performs a multipart upload for large objects. In a multipart upload, a large file is split into multiple parts and uploaded separately to Amazon S3. After all the parts are uploaded, Amazon S3 combines the parts into a single file. A multipart upload can result in faster uploads and lower chances of failure with large files.
For more information on multipart uploads, see How do I use the AWS CLI to perform a multipart upload of a file to Amazon S3?
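To make the mechanics concrete, here is a rough boto3 sketch of a multipart upload driven by hand; the bucket, key, and part size are placeholders, and the high-level CLI and SDK transfer helpers do all of this for you automatically.

import boto3

s3 = boto3.client("s3")
bucket = "docexamplebucket"      # placeholder bucket name (from the cp example above)
key = "large-video.mp4"          # placeholder object key / local file name
part_size = 8 * 1024 * 1024      # 8 MiB parts; the real tools pick a size automatically

# 1. Start the multipart upload and remember its ID
mpu = s3.create_multipart_upload(Bucket=bucket, Key=key)
parts = []

# 2. Upload the file in fixed-size parts
with open(key, "rb") as f:
    part_number = 1
    while True:
        chunk = f.read(part_size)
        if not chunk:
            break
        resp = s3.upload_part(Bucket=bucket, Key=key, UploadId=mpu["UploadId"],
                              PartNumber=part_number, Body=chunk)
        parts.append({"ETag": resp["ETag"], "PartNumber": part_number})
        part_number += 1

# 3. Ask S3 to combine the parts into a single object
s3.complete_multipart_upload(Bucket=bucket, Key=key, UploadId=mpu["UploadId"],
                             MultipartUpload={"Parts": parts})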
AWS SDK
For a programmable approach to uploading large files, consider using an AWS SDK, such as the AWS SDK for Java. For an example operation, see Upload an object using the AWS SDK for Java.
Note: For a full list of AWS SDKs and programming toolkits for developing and managing applications, see Tools to build on AWS.
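The linked example uses the SDK for Java; as a roughly equivalent sketch with the AWS SDK for Python (boto3), where the file name, bucket, and transfer settings are illustrative only:

import boto3
from boto3.s3.transfer import TransferConfig

# Illustrative settings: switch to multipart above 8 MiB and upload parts in parallel
config = TransferConfig(multipart_threshold=8 * 1024 * 1024, max_concurrency=10)

s3 = boto3.client("s3")
s3.upload_file("large-video.mp4", "docexamplebucket", "videos/large-video.mp4", Config=config)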
For more details, see https://aws.amazon.com/premiumsupport/knowledge-center/s3-large-file-uploads/
I'm using Logstash to process logs for our centralized logging, and the inputs are in S3 in gz format. I need to create a cost projection for this process. Does Logstash download the S3 objects, or does it parse them remotely?
Amazon S3 is an object storage service. Data is not "processed" on Amazon S3.
If an application wants to process data from Amazon S3, it would need to download the files from Amazon S3 and process the data locally. An exception to this is the Amazon S3 Select feature, which can query data directly on Amazon S3.
In terms of cost, if the Amazon EC2 instance is in the same region as the Amazon S3 bucket, there is no data transfer cost for downloading the data.
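To illustrate the S3 Select exception mentioned above, here is a rough boto3 sketch that queries a gzipped JSON-lines log object in place; the bucket, key, and query are placeholders, and S3 Select has to be available for your account.

import boto3

s3 = boto3.client("s3")

# Placeholders: bucket and key of one of the gzipped log objects
response = s3.select_object_content(
    Bucket="my-log-bucket",
    Key="logs/2023/app.log.gz",
    ExpressionType="SQL",
    Expression="SELECT * FROM s3object s LIMIT 10",
    InputSerialization={"CompressionType": "GZIP", "JSON": {"Type": "LINES"}},
    OutputSerialization={"JSON": {}},
)

# The result comes back as an event stream; only the Records events carry data
for event in response["Payload"]:
    if "Records" in event:
        print(event["Records"]["Payload"].decode("utf-8"))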
Can the AWS S3 CLI sync command use multipart upload?
I am syncing from an on-premises server to S3 using aws s3 sync, but the speed is very slow.
Assuming you mean the aws command (and not e.g. s3cmd): Yes, sync uses multipart upload by default. From the docs:
All high-level commands that involve uploading objects into an Amazon S3 bucket (aws s3 cp, aws s3 mv, and aws s3 sync) automatically perform a multipart upload when the object is large
I guess the slowness is caused by another factor, e.g. your bandwidth is low (check with a speed test, for example) or it is already saturated.
As per my project requirement, I want to fetch some files from an on-prem FTP server and put them into an S3 bucket. The files are 1-2 GB in size. Once a file is put into the FTP server folder, I want that file to be uploaded to the S3 bucket.
Please suggest the easiest way to achieve this.
Note: Mostly the files will be put onto the FTP server only once a day, so I don't want to continuously scan the FTP server. Once the files have been uploaded to S3 from the FTP server, I want to terminate any resources (like EC2) created in AWS.
These are my ideas:
I think you could create an agent on your FTP server that uploads the files every N seconds/minutes/hours/etc. using the AWS CLI (see the sketch after this list). This way you avoid external access to your FTP server.
Another approach is a Lambda function for the pulling process, but as you said, the FTP server doesn't allow external access.
Create a VPN between your on-prem environment and the cloud infrastructure, create a CloudWatch event, and execute the pulling process through a Lambda function. Here you can configure a timeout.
Create a VPN between your on-prem environment and the cloud infrastructure, and from your FTP server upload the files using the AWS CLI (pay attention to the sync option). Take a look at this link: https://aws.amazon.com/answers/networking/accessing-vpc-endpoints-from-remote-networks/
With Jenkins, create a task to execute a process that uploads the files.
You can use AWS Storage Gateway; visit its site here: https://aws.amazon.com/es/storagegateway/
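As a minimal sketch of the agent idea from the first option, assuming the files land in /data/ftp/incoming and the target bucket is my-ingest-bucket (both placeholders), a small polling script using boto3 could look like this:

import time
from pathlib import Path
import boto3

WATCH_DIR = Path("/data/ftp/incoming")   # placeholder: the FTP drop folder on the server
BUCKET = "my-ingest-bucket"              # placeholder bucket name
s3 = boto3.client("s3")
seen = set()

while True:
    for path in WATCH_DIR.glob("*"):
        if path.is_file() and path not in seen:
            # upload_file performs a multipart upload automatically for large (1-2 GB) files
            s3.upload_file(str(path), BUCKET, path.name)
            seen.add(path)
    time.sleep(300)  # poll every 5 minutes; adjust to match the once-a-day schedule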
Here is how we solved it.
Enable S3 Transfer Acceleration on your S3 bucket. This is very much needed, since you are pushing large files.
If you have access to the server, install the AWS CLI and perform a sync of the folder to the S3 bucket. The AWS CLI will automatically sync your folder to the bucket, so if you change any of your existing files, they will be kept in sync with the S3 bucket. This is the ideal and simplest way if you have access to the server and are able to install the AWS CLI.
https://docs.aws.amazon.com/AmazonS3/latest/dev/transfer-acceleration-examples.html#transfer-acceleration-examples-aws-cli
aws s3api put-bucket-accelerate-configuration --bucket bucketname --accelerate-configuration Status=Enabled
If you want to enable it for a specific profile or the default profile:
aws configure set default.s3.use_accelerate_endpoint true
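If you also upload from scripts, the same accelerate endpoint can be used from boto3; this is a sketch with a placeholder file name and the bucket from the command above, and it assumes acceleration has already been enabled on the bucket as shown.

import boto3
from botocore.config import Config

# Route requests through the S3 Transfer Acceleration endpoint
s3 = boto3.client("s3", config=Config(s3={"use_accelerate_endpoint": True}))
s3.upload_file("backup-file.tar.gz", "bucketname", "backup-file.tar.gz")  # placeholder file name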
If you don't have access to the FTP server on your premises, you need an external server to perform this process. In that case you need to poll or share the file system, copy the file locally, and move it to the S3 bucket. There will be a lot of failure points with this process.
Hope it helps.
I'm using AWS to host a static website. Unfortunately, it's very tedious to upload the directory to S3. Is there any way to streamline the process?
Have you considered using the AWS CLI (AWS Command Line Interface) to interact with AWS services and resources?
Once you install and configure the AWS CLI, all you need to do to update the site is:
aws s3 sync /local/dev/site s3://my-website-bucket
This way you can continue developing the static site locally, and a simple aws s3 sync call will automatically detect which files have changed since the last sync and upload them to S3 without any mess.
To make the newly created objects public (if this is not already done via a bucket policy):
aws s3 sync /local/dev/site s3://my-website-bucket --acl public-read
The best part is that multipart upload is built in. Additionally, you can sync back from S3 to local (the reverse direction).