I would like to build a web service for an iPhone app. As for file uploads, I'm wondering what the standard procedure and most cost-effective solution is. As far as I can see, there are two possibilities:
Client > S3: I upload a file from the iPhone to S3 directly (with the AWS SDK)
Client > EC2 > S3: I upload a file to my server (EC2 running Django) and then the server uploads the file to S3 (as detailed in this post)
I'm not planning on modifying the file in any way. I only need to tell the database to add an entry. So if I were to upload a file Client > S3, I'd need to connect to the server anyway in order to create the database entry.
It seems as if EC2 > S3 doesn't cost anything as long as the two are in the same region.
I'd be interested to hear what the advantages and disadvantages are before I start implementing file uploads.
I would definitely upload directly to S3, for scalability reasons. True, data transfer between S3 and EC2 is fast and cheap, but uploads are long-running, unlike normal web requests, so you may saturate the NIC on your EC2 instance.
Instead, have your server return a GUID to the client; the client uploads the file to S3 with the key set to the GUID and the Content-Type set appropriately, then calls a web service/Ajax endpoint to create a DB record for that GUID once the upload completes.
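A rough sketch of those two server-side pieces, assuming a Python/boto3 backend on the Django side; the bucket name, function names and DB model below are hypothetical placeholders, not anything from the question:

import uuid
import boto3

s3 = boto3.client("s3")
BUCKET = "my-upload-bucket"  # hypothetical bucket name

def issue_upload_key():
    # Endpoint 1: hand the client a GUID to use as the S3 object key.
    return str(uuid.uuid4())

def register_upload(guid, content_type):
    # Endpoint 2: called by the client after the upload to S3 completes.
    # head_object raises if the object never arrived, so only real uploads get recorded.
    s3.head_object(Bucket=BUCKET, Key=guid)
    # Create the DB entry here, e.g. FileEntry.objects.create(key=guid, content_type=content_type)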
I have a few EC2 instances (t2.micro) behind a load balancer in the us-east-1 region (N. Virginia), and my users access the application from South America. This is my current setup mainly because costs are about 50% of what I would pay for the same services here in Brazil.
My uploads all go to S3 buckets, also in the us-east-1 region.
When a user requests a file from my app, I check for permission, because the buckets are not public (which is why I need all data to go through the EC2 instances), and I stream the file from S3 to the user. The download speeds for the users are fine and usually reach the maximum the user's connection can handle, since I have transfer acceleration enabled for my buckets.
My issue is uploading files through the EC2 instances. The upload speeds suffer a lot, and in this case having transfer acceleration enabled on S3 does not help in any way. It feels like I'm being throttled by AWS, because the maximum speed is capped at around 1 Mb/s.
I could maybe transfer files directly from the user to S3, then update my databases, but that would introduce a few issues to my main workflow.
So, I have two questions:
1) Is it normal for upload speeds to EC2 instances to suffer like that?
2) What options do I have, other than moving all services to South America, closer to my users?
Thanks in advance!
There is no need to 'stream' data from Amazon S3 via an Amazon EC2 instance. Nor is there any need to 'upload' via Amazon EC2.
Instead, you should be using Pre-signed URLs. These are URLs that grant time-limited access to upload to, or download from, Amazon S3.
The way it works is:
Your application verifies whether the user is permitted to upload/download a file
The application then generates a Pre-signed URL with an expiry time (eg 5 minutes)
The application supplies the URL to the client (eg a mobile app) or includes it in an HTML page (as a link for downloads or as a form for uploads)
The user then uploads/downloads the file directly to/from Amazon S3
The result is a highly scalable system because your EC2 system does not need to be involved in the actual data transfer.
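A minimal sketch of generating both kinds of pre-signed URL with boto3 (the bucket and key names are hypothetical; the other AWS SDKs expose equivalent calls):

import boto3

s3 = boto3.client("s3")
BUCKET = "my-bucket"  # hypothetical

# Time-limited download link, eg to hand to the client or embed in a page
download_url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": BUCKET, "Key": "uploads/report.pdf"},
    ExpiresIn=300,  # 5 minutes
)

# Time-limited upload target, returned as a URL plus hidden fields for an HTML POST form
upload = s3.generate_presigned_post(
    Bucket=BUCKET,
    Key="uploads/${filename}",
    ExpiresIn=300,
)
# upload["url"] is the form action; upload["fields"] go into the form as hidden inputs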
See:
Share an Object with Others - Amazon Simple Storage Service
Uploading Objects Using Pre-Signed URLs - Amazon Simple Storage Service
As per my project requirements, I want to fetch some files from an on-prem FTP server and put them into an S3 bucket. The files are 1-2 GB in size. Once a file is placed in the FTP server folder, I want it to be uploaded to the S3 bucket.
Please suggest the easiest way to achieve this.
Note: the files will mostly be put on the FTP server only once a day, so I don't want to scan the FTP server continuously. Once the files have been uploaded from the FTP server to S3, I want to terminate any resources (like EC2) created in AWS.
These are my ideas:
I think you could create an agent on your FTP server that uploads the files every N seconds/minutes/hours/etc. using the AWS CLI. This way you avoid external access to your FTP server.
Another approach is a Lambda function for the pulling process, but, as you said, the FTP server doesn't allow external access.
Create a VPN between your on-prem network and the cloud infra, create a CloudWatch event, and execute the pulling process through a Lambda (where you can configure a timeout).
Create a VPN between your on-prem network and the cloud infra, and from your FTP server upload the files using the AWS CLI (pay attention to the sync option). Take a look at this link: https://aws.amazon.com/answers/networking/accessing-vpc-endpoints-from-remote-networks/
With Jenkins, create a task that executes a process to upload the files.
You can use Storage Gateway; visit its site here: https://aws.amazon.com/es/storagegateway/
Here is how we solved it.
Enable S3 Transfer Acceleration on your S3 bucket. This helps a lot, since you are pushing large files.
If you have access to the server, install the AWS CLI and perform a sync from the folder to the S3 bucket. The AWS CLI will automatically sync your folder with the bucket, so if you change any of your existing files they are kept in sync with the S3 bucket. This is the ideal and simplest way if you have access to the server and are able to install the AWS CLI.
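The sync itself is a single command; the folder path and bucket name here are placeholders:
aws s3 sync /path/to/ftp/folder s3://bucketname/incoming/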
https://docs.aws.amazon.com/AmazonS3/latest/dev/transfer-acceleration-examples.html#transfer-acceleration-examples-aws-cli
aws s3api put-bucket-accelerate-configuration --bucket bucketname --accelerate-configuration Status=Enabled
If you want to enable the accelerate endpoint for a specific profile or the default profile:
aws configure set default.s3.use_accelerate_endpoint true
If you don't have access to the FTP server on your premises, you need an external server to perform this process. In that case you need to poll the FTP server or share its file system, copy the files locally, and move them to the S3 bucket. There will be a lot of failure points with this process.
Hope it helps.
I have a website which I want to host on AWS. My website stores its data in a back-end RDBMS/MongoDB and uses PHP/JavaScript/Python, etc.
My website will be receiving data from users and I will be using it for analysis. I want to avoid doing any installation.
Which is best for my requirement: AWS S3 or AWS EC2?
S3 is just file storage; you can't run dynamic applications (PHP/Python/etc.) or databases on it. You probably need to run your application on EC2, your database on either EC2 or RDS, and store your application's static files on S3.
We have an application written in C++/Qt that sits on client machines all around the world, and the clients upload files to Amazon S3. What is the best way to authenticate with Amazon without actually including the Amazon key on every client? Is there any way to generate a unique key for each client (thousands of potential clients)?
Would it make more sense to send everything to an intermediate server or proxy and then from there upload the files to Amazon S3?
The optimal way to implement this is to follow the steps below:
Step 1: Allow your clients to upload files to an intermediate server.
Step 2: Store your AWS S3 bucket user credentials in a file or database on the intermediate server.
Step 3: Upload the files from the intermediate server to the Amazon S3 bucket.
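A minimal sketch of step 3 on the intermediate server, assuming Python/boto3 there (the bucket name and function name are hypothetical); boto3 picks up the credentials stored in step 2 from its usual configuration or environment:

import boto3

s3 = boto3.client("s3")  # uses the credentials stored on the intermediate server

def forward_to_s3(local_path, bucket="my-app-uploads", key=None):
    # Push a file received from a client on to S3; upload_file handles multipart transfers automatically.
    s3.upload_file(local_path, bucket, key or local_path.rsplit("/", 1)[-1])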
Hope it helps....
I am developing an app server with NodeJS and AWS.
I am setting up the server environment with ELB and EC2s.
I am using ELB as the load balancer and have attached several app server EC2 instances to it.
And one EC2 instance is used for MongoDB.
My question is about requests that include file uploads.
I think uploaded files should not live on the app server (EC2 instance), so I will try to save uploaded files in S3 and allow the app servers (EC2 instances) to access them.
The rough solution is that when an app server accepts a file from a client, it moves the file to S3 and then deletes the local copy.
But that causes some performance loss, and I don't feel it's a clean approach.
Is this the best way, or is there another way to solve it?
I think uploading the file to S3 is the best way.
But the file is uploaded together with other data (for example, a profile upload: name: String, age: Number, profileImage: File).
I need to process the other data on the app server, so the client should not upload to S3 directly.
Is there a better idea?
Please save me.
P.S.: Please let me know if you cannot understand my wording, because I am not a native speaker. If so, I will do my best to add some explanation!
You can upload directly to S3 using temporary credentials that allow the end user to write to your bucket.
There is a good article with detailed code for doing exactly what you are trying to do with node.js here.
Answers that refer to external links are frowned upon on SO, so in a nutshell:
include the aws sdk in your application
provide it with appropriate credentials
use those credentials to generate a signed URL with a short lifespan
provide the end user with the signed URL they can then use to upload, preferably asynchronously with progress feedback
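The upload itself is then just an HTTP PUT to that signed URL. A minimal sketch, shown in Python for brevity with a placeholder URL and file name (in a browser you would do the same PUT with fetch/XHR and use its progress events):

import requests

signed_url = "https://my-bucket.s3.amazonaws.com/uploads/abc123?X-Amz-Signature=..."  # placeholder signed URL

with open("profile.jpg", "rb") as f:
    resp = requests.put(signed_url, data=f, headers={"Content-Type": "image/jpeg"})
    # The Content-Type must match whatever was specified when the URL was signed, if anything.
resp.raise_for_status()  # a 200 means the object is now in S3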