Amazon Beanstalk, S3 and local files

I have a web application in PHP that accepts image uploads from a user interface (sent using some JavaScript). The PHP application processes the image and saves several versions to disk, in different formats and resolutions.
Now I'm trying to integrate Amazon S3 into this application.
1) At which point do I actually save the file to S3?
2) Should I only do it at the end, to store the final versions, and in the meantime save temporary versions on the EC2 instance, or should I never save to the EC2 instance at all?
3) One of my main worries: say the user uploads the file but does not press Save, which is the step that would actually store it to Amazon S3, and the load increases before Save is pressed. Is there a chance that, by the time he/she presses Save, the user could end up on a different instance where the local image does not exist?

This is probably not the best solution, but it is at least worth mentioning: you could mount an S3 bucket as a local folder on your server (using RioFS, for example), and when a file is ready to be uploaded to S3, copy it to that folder; RioFS will automatically upload it to the remote S3 bucket.
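If you would rather skip the mount and call the SDK from your application once processing is done, here is a minimal Python/boto3 sketch of pushing the final renditions up to S3; the file paths, bucket name, and key scheme are placeholders, and the same idea applies with the AWS SDK for PHP.

```python
import boto3  # assumes the AWS SDK for Python is installed and credentials are configured

s3 = boto3.client("s3")

# Hypothetical list of renditions produced by the image-processing step.
renditions = [
    "/tmp/uploads/123/original.jpg",
    "/tmp/uploads/123/thumb_200x200.jpg",
    "/tmp/uploads/123/web_1024.webp",
]

for path in renditions:
    key = "images/123/" + path.rsplit("/", 1)[-1]   # object key derived from the filename
    s3.upload_file(path, "my-example-bucket", key)  # bucket name is a placeholder
```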

Related

Flutter upload files to AWS s3 faster with upload progress

I am facing a problem while uploading one or more files (images/videos) to an AWS S3 bucket using the aws_s3_client plugin.
1) It is taking too much time to upload a 10 MB file.
2) I am not able to track the upload progress percentage.
3) There is no option to upload multiple files at once (to the same bucket).
4) Every time we upload we have to verify the IAM user access. (Why can't we verify once with a single instance and keep the connection persistent/keep-alive until the application is closed?)
I am not familiar with AWS services, so please suggest the best way to upload a file or multiple files to an AWS S3 bucket: faster, with an upload progress percentage, with multiple files uploaded at once, and with a persistent/keep-alive connection.
For 1 and 2, use managed uploads; they provide an event to track upload progress and make uploads faster by using multipart upload. Note that parts in a multipart upload must be between 5 MB and 5 GB (except the last part), and the maximum object size is 5 TB.
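As a rough illustration of what a managed (multipart) upload with a progress callback looks like, here is a Python/boto3 sketch; the file path, bucket name, and part-size settings are placeholders, and the equivalent transfer managers exist in the other AWS SDKs, including the ones usable from Flutter backends.

```python
import os
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")
path = "video.mp4"                # placeholder local file
total = os.path.getsize(path)

# Multipart settings: split into 8 MB parts and upload several parts in parallel.
config = TransferConfig(multipart_threshold=8 * 1024 * 1024,
                        multipart_chunksize=8 * 1024 * 1024,
                        max_concurrency=4)

sent = 0
def progress(bytes_transferred):
    # Called repeatedly by the transfer manager with the bytes sent since the last call.
    global sent
    sent += bytes_transferred
    print(f"{sent / total:.0%} uploaded")

s3.upload_file(path, "my-example-bucket", "uploads/video.mp4",
               Config=config, Callback=progress)
```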
For 3, note that uploading an object with a key that already exists in the bucket will overwrite the existing object. Depending on your requirement, you can turn on versioning in your bucket, and that will keep different versions of the same key.
For 4, you can generate and use pre-signed URLs. Pre-signed URLs have a configurable expiration that you can adjust depending on how long you want the link to be available for an upload.
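A pre-signed upload URL is generated server-side and handed to the client, which then PUTs the file directly to S3 without further credential checks. A minimal Python/boto3 sketch, with the bucket name and key as placeholders:

```python
import boto3

s3 = boto3.client("s3")

# URL is valid for 15 minutes; the client uploads with an HTTP PUT to this URL.
url = s3.generate_presigned_url(
    "put_object",
    Params={"Bucket": "my-example-bucket", "Key": "uploads/video.mp4"},
    ExpiresIn=900,
)
print(url)
```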
Use multipart upload. Multipart upload will upload files to S3 more quickly.
https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpuoverview.html

Workaround for handling a CPU-intensive task in AWS EC2?

I have created a Django application (running on AWS EC2) which converts media files from one format to another, but this process consumes CPU resources, for which I have to pay AWS.
I am trying to find a workaround where my local PC (Ubuntu) takes care of the CPU-intensive task and the final result is uploaded to an S3 bucket which I can share with the user.
Proposed solution: when the user uploads a media file (HTML upload form) it goes to the S3 bucket, and at the same time the S3 file link is sent via a socket connection to my Ubuntu machine, which downloads the file, processes it, and uploads the result back to the S3 bucket.
Could anyone please suggest a better solution, as this one does not seem efficient?
Please note: I have a decent internet connection and a computer that can handle the backend very well, but I am not in a position to pay the extra charges to AWS.
The best solution for this is to create a separate Lambda function for this task. Trigger the Lambda whenever someone uploads a file to S3. The Lambda will process the file and store the result back to S3.
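A minimal sketch of such a handler in Python, assuming the function is wired to the bucket's ObjectCreated event notification; convert() is a placeholder for whatever media conversion you actually run, and the bucket, prefixes, and paths are illustrative only.

```python
import boto3

s3 = boto3.client("s3")

def handler(event, context):
    # S3 puts one or more records in the event when an object is created.
    # Note: in real notifications the key may be URL-encoded.
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]

        local_in = "/tmp/" + key.rsplit("/", 1)[-1]
        local_out = local_in + ".converted"

        s3.download_file(bucket, key, local_in)
        convert(local_in, local_out)                          # placeholder conversion step
        s3.upload_file(local_out, bucket, "converted/" + key)

def convert(src, dst):
    # Stand-in for the real media conversion logic.
    raise NotImplementedError
```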

How shall I expose an S3 endpoint for clients to upload files?

I am running a project wherein my clients upload data files to my own server through SFTP.
Now the requirement is to move my application to the cloud, so I want those clients to upload those data files to my S3 bucket.
From a design and security perspective, what are the approaches through which I can ask my clients to upload those files to S3? Shall I expose an application API (which will upload files to S3) to my clients, or is there another, better way to achieve this?
EDIT:
I would be uploading approximately 200 files daily, each approximately 2-3 MB in size. These file uploads can't be scheduled; they are event driven. Our clients SFTP the files as and when they need some processing of those files at our end.
If your clients are already using SFTP then you should consider simply migrating them to the managed SFTP service on AWS, which is part of AWS Transfer Family.
This will mean minimal change for your clients, and will allow you to shift their uploads directly into S3, which is ultimately where you want them to be.
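Setting up the managed SFTP endpoint is usually done in the console, but as a rough Python/boto3 sketch of the same steps (the role ARN, bucket, user name, and SSH key are placeholders; see the Transfer Family documentation for the full set of options):

```python
import boto3

transfer = boto3.client("transfer")

# SFTP server backed by S3, with users managed by the service itself.
server = transfer.create_server(
    Domain="S3",
    Protocols=["SFTP"],
    IdentityProviderType="SERVICE_MANAGED",
)

# One SFTP user per client, landing in their own prefix of the bucket.
transfer.create_user(
    ServerId=server["ServerId"],
    UserName="client-a",                                      # placeholder
    Role="arn:aws:iam::123456789012:role/transfer-s3-role",   # placeholder role with S3 access
    HomeDirectory="/my-example-bucket/client-a",              # placeholder bucket/prefix
    SshPublicKeyBody="ssh-rsa AAAA... client-a-key",          # placeholder public key
)
```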
If all your service does is upload to S3, use IAM users/policies to grant your clients access to the S3 bucket instead, as your service would act only as a proxy and add extra maintenance and cost.
If the data that you store on S3 is very critical, I'd suggest you look at this:
https://docs.aws.amazon.com/AmazonS3/latest/dev/security-best-practices.html#security-best-practices-prevent
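If you go the IAM route, each client can receive credentials whose policy only allows putting objects under their own prefix. A hedged Python/boto3 sketch; the user name, bucket, and prefix are placeholders:

```python
import json
import boto3

iam = boto3.client("iam")

# Inline policy: this client may only upload under their own prefix of the bucket.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:PutObject"],
        "Resource": "arn:aws:s3:::my-example-bucket/client-a/*",
    }],
}

iam.put_user_policy(
    UserName="client-a",                 # placeholder IAM user
    PolicyName="client-a-s3-upload",
    PolicyDocument=json.dumps(policy),
)
```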
However, there can be cases where you would want to expose an endpoint; for example:
The client only requires the functionality to upload a file and no other operation. Here, the implementation is abstracted from the client, and you can internally use or migrate to any other data store (be it S3 or something else) without affecting the clients. But consider this only if it is a real possibility.
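Even then, the endpoint does not have to proxy the bytes itself; it can simply return a short-lived pre-signed POST that the client uses to upload straight to S3. A sketch with Python/boto3; the bucket and key-naming scheme are placeholders:

```python
import uuid
import boto3

s3 = boto3.client("s3")

def create_upload(filename):
    # Key under which the client's file will land; the naming scheme is up to you.
    key = f"incoming/{uuid.uuid4()}/{filename}"

    # URL and form fields for an HTTP POST upload, valid for 10 minutes,
    # limited to objects up to 10 MB.
    return s3.generate_presigned_post(
        Bucket="my-example-bucket",
        Key=key,
        Conditions=[["content-length-range", 0, 10 * 1024 * 1024]],
        ExpiresIn=600,
    )
```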

Uploading Directly to S3 vs Uploading Through EC2

I'm developing a mobile app that will use AWS for its backend services. In the app I need to upload video files to S3 on a frequent basis, and I'm wondering what the recommended architecture would look like to make this scalable and efficient. Traffic could be high, and file sizes could be large.
-On one hand, I could upload directly to S3 using the S3 API on the client side. This would be the easiest option, but I'm not sure of the negative implications associated with it.
-The other way to do it would be to go through an EC2 instance and handle the request using some PHP scripts and upload from there.
So my question is... Are these two options equal, or are there major drawbacks to one of them as opposed to the other? I will already have EC2 instances configured for database access, if that makes any difference in how you approach the question.
I recommend uploading directly to S3 using the S3 API on the client side, as you can speed up the upload process by using S3 multipart (part) upload since your video files are going to be large.
The second method would put extra CPU load on your EC2 instance, as both the script processing and the upload to S3 use CPU.
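For reference, "part upload" here means S3 multipart upload: the file is split into parts that are uploaded (possibly in parallel) and then combined into one object. A simplified sequential Python/boto3 sketch of the underlying API; the bucket, key, local file, and part size are placeholders, and mobile SDKs wrap this same mechanism:

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "my-example-bucket", "videos/clip.mp4"   # placeholders
part_size = 8 * 1024 * 1024                            # 8 MB parts

mpu = s3.create_multipart_upload(Bucket=bucket, Key=key)
parts, part_number = [], 1

with open("clip.mp4", "rb") as f:                      # placeholder local file
    while True:
        data = f.read(part_size)
        if not data:
            break
        resp = s3.upload_part(Bucket=bucket, Key=key, UploadId=mpu["UploadId"],
                              PartNumber=part_number, Body=data)
        parts.append({"ETag": resp["ETag"], "PartNumber": part_number})
        part_number += 1

# Tell S3 to assemble the uploaded parts into the final object.
s3.complete_multipart_upload(Bucket=bucket, Key=key, UploadId=mpu["UploadId"],
                             MultipartUpload={"Parts": parts})
```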

EC2 and S3 image server

I'm creating an image upload service using EC2 and S3.
The user uploads an image to EC2 using PHP. EC2 uploads it to S3 and then responds to the user with the image link.
I was wondering how fast the upload between EC2 and S3 in the same region is.
Would it be better to store the image temporarily on EC2, respond to the user first and upload to S3 later, or to wait for the upload to finish before responding to the user?
I was wondering how fast the upload between EC2 and S3 in the same region is
It's fast. Test it.
You should find that you can upload the image to S3 very quickly, then return the S3 URL to the client, where they'll immediately be able to fetch the image.
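In other words, the handler can do the S3 upload synchronously and only then respond. The question's stack is PHP, so treat this purely as an illustrative Python/boto3 sketch of that flow; the bucket name, key, URL format, and content type are placeholders:

```python
import boto3

s3 = boto3.client("s3")

def handle_upload(local_path, key):
    # Upload the processed image, then hand the object URL back to the caller.
    s3.upload_file(local_path, "my-example-bucket", key,
                   ExtraArgs={"ContentType": "image/jpeg"})
    return f"https://my-example-bucket.s3.amazonaws.com/{key}"
```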
Caveat: if you are overwriting an S3 object at the same path, rather than creating a new object, there can be a delay after you upload the object before the new version is consistently returned for every request. This delay is unlikely, but possible, due to the eventual consistency model of S3. Deletes are the same way -- a deleted object may still be fetchable, briefly, before requests to S3 return 404 or 403.
See What is maximum Amazon S3 replication time on file upload? and note the change you should make to the endpoint if you're working in the US Standard (us-east-1) region to ensure immediate consistency.
It will be plenty fast; the latency between the user and your EC2 instance will be much bigger than the latency between EC2 and S3.
On the other hand, if EC2 is not doing anything to the image before uploading it to S3, why not upload it directly to S3?