I am trying to build a file upload and download app using AWS API Gateway, AWS Lambda, and S3 for storage.
AWS Lambda puts a cap of 6 MB on the request payload and API Gateway a limit of 10 MB.
Therefore we decided to use pre-signed URLs for uploading and downloading files.
Step 1 - The client sends the list of filenames (let's say 5 files) to Lambda.
Step 2 - Lambda creates and returns the list of pre-signed PUT URLs for those files (5 URLs); a sketch of this step appears below.
Step 3 - The client uploads the files to S3 using the URLs it received.
Note - the filenames are used as the S3 object keys.
We take a similar approach for downloading files.
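For illustration, here is a minimal sketch of step 2 as a Node.js Lambda handler, assuming the AWS SDK for JavaScript v3 (@aws-sdk/client-s3 and @aws-sdk/s3-request-presigner); the BUCKET_NAME environment variable and the event shape are hypothetical placeholders:

```typescript
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";

const s3 = new S3Client({});
const BUCKET = process.env.BUCKET_NAME!; // assumed environment variable

export const handler = async (event: { filenames: string[] }) => {
  // One pre-signed PUT URL per filename; the filename doubles as the S3 key.
  const urls = await Promise.all(
    event.filenames.map((key) =>
      getSignedUrl(s3, new PutObjectCommand({ Bucket: BUCKET, Key: key }), {
        expiresIn: 3600, // each URL is valid for one hour
      })
    )
  );
  return { urls };
};
```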
Now the issue is latency: the whole flow takes quite a long time, and performance suffers as a result.
The question is: is the above approach the only way to do file upload and download with Lambda?
This looks like a case for S3 Transfer Acceleration. You'll still create pre-signed URLs, but you enable this special setting on the S3 bucket, which reduces latency.
https://docs.aws.amazon.com/AmazonS3/latest/dev/transfer-acceleration.html
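A minimal sketch of signing against the accelerate endpoint, assuming the AWS SDK for JavaScript v3 (bucket and key are placeholders, and Transfer Acceleration must already be enabled on the bucket):

```typescript
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";

// useAccelerateEndpoint makes the client sign URLs for
// <bucket>.s3-accelerate.amazonaws.com instead of the regional endpoint.
const s3 = new S3Client({ useAccelerateEndpoint: true });

const url = await getSignedUrl(
  s3,
  new PutObjectCommand({ Bucket: "my-bucket", Key: "upload.bin" }), // placeholders
  { expiresIn: 3600 }
);
```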
Alternatively, you can use CloudFront with an S3 origin to upload / download files. You might have to re-architect your solution, but with CloudFront and the AWS networking backbone, latency can be reduced a lot.
Related
I'd like to be able to detect when all parts of an S3 multipart upload have been uploaded.
Context
I'm working on a Backend application that sits between a Frontend and an S3 bucket.
When the Frontend needs to upload a large file, it makes a call to the Backend (step 1). The latter initiates a multipart upload in S3, generates a bunch of pre-signed URLs, and hands them to the Frontend (steps 2 - 5). The Frontend then uploads segment data directly to S3 (steps 6, 10).
S3 multipart uploads need to be explicitly completed. One obvious way to do this would be to make another call from the Frontend to the Backend to notify it that all parts have been uploaded, but if possible I'd like to avoid that extra call.
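For context, a minimal sketch of steps 2 - 5 on the Backend, assuming the AWS SDK for JavaScript v3 (bucket, key, and part count are hypothetical placeholders):

```typescript
import {
  S3Client,
  CreateMultipartUploadCommand,
  UploadPartCommand,
} from "@aws-sdk/client-s3";
import { getSignedUrl } from "@aws-sdk/s3-request-presigner";

const s3 = new S3Client({});

export async function startUpload(bucket: string, key: string, parts: number) {
  // Step 2: initiate the multipart upload and get its UploadId.
  const { UploadId } = await s3.send(
    new CreateMultipartUploadCommand({ Bucket: bucket, Key: key })
  );

  // Steps 3-5: one pre-signed URL per part; the Frontend PUTs data to these.
  const urls = await Promise.all(
    Array.from({ length: parts }, (_, i) =>
      getSignedUrl(
        s3,
        new UploadPartCommand({
          Bucket: bucket,
          Key: key,
          UploadId,
          PartNumber: i + 1, // part numbers are 1-based
        }),
        { expiresIn: 3600 }
      )
    )
  );
  return { uploadId: UploadId, urls };
}
```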
A possible solution: S3 Event Notifications
I have S3 Event Notifications enabled on the S3 bucket, so whenever something happens, it notifies an SNS topic, which in turn calls the Backend.
If the bucket sent S3 notifications after each part finished uploading, I could use those in the Backend to see whether it's time to complete the upload (steps 7 - 9, 11 - 14).
But although some folks claim (one, two) that this is the case, I wasn't able to reproduce it.
For proof of concept, I used this guide from Amazon to upload a file using aws s3api create-multipart-upload, several aws s3api upload-part, and aws s3api complete-multipart-upload. I would expect to get a notification after each upload-part, but I only got a single "s3:ObjectCreated:CompleteMultipartUpload" after, well, complete-multipart-upload.
My bucket is configured to send notifications for all object creation events: "s3:ObjectCreated:*".
Questions
Is it possible to somehow instruct S3 to send notifications upon upload of each part?
Are there any other mechanisms to find out in the Backend that all parts have been uploaded?
Maybe what I want is complete nonsense, and even if there were a way to implement it, it would bring significant drawbacks?
Setup:
We are running an e-commerce website consisting of CloudFront --> ALB --> EC2. We serve the images from S3 via a CloudFront behavior.
Issue:
Our admin URL is like example.com/admin. We upload product images via the admin panel as a zip file that goes through CloudFront. Each zip file is around 100 MB-150 MB and contains around 100 images. While uploading the zip file we get a 502 gateway error from CloudFront because the request takes more than 30 seconds, the default timeout value for CloudFront.
Expected solution:
Is there a way we can skip CloudFront for uploading images only?
Alternatively, is there a way to increase the timeout value for CloudFront?
Note: any recommended solutions are highly appreciated.
CloudFront is a CDN service that speeds up your services by caching your static files in edge locations, so it won't help you on the uploading side.
In my opinion, for the image-upload feature you should use the AWS SDK to connect directly to S3.
If you want to upload files directly to S3 from the client, I can highly suggest using S3 pre-signed URLs.
You create an endpoint in your API that creates the pre-signed URL for a certain object (myUpload.zip), pass it back to the client, and use that URL to do the upload. It's safe, and you won't have to expose any credentials for uploading. Make sure to set the expiration time to a reasonable value (e.g., one hour).
More on pre-signed URLs here: https://aws.amazon.com/blogs/developer/generate-presigned-url-modular-aws-sdk-javascript/
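On the client side the upload is then just an HTTP PUT to the returned URL. A minimal sketch, where the /presign endpoint and its response shape are hypothetical names for the API described above:

```typescript
// Client-side sketch: fetch a pre-signed URL from your API, then PUT the file.
async function uploadZip(file: File): Promise<void> {
  const res = await fetch(`/presign?key=${encodeURIComponent(file.name)}`);
  const { url } = await res.json();

  // Upload straight to S3, bypassing CloudFront and your servers entirely.
  await fetch(url, { method: "PUT", body: file });
}
```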
Tried uploading a file of size ~220 MB to S3. I tried doing this through the AWS console and it took a lot of time; the upload speed was around 500 Kbps on average. I know my network isn't the bottleneck, because I'm able to upload the same file to Google Drive in about 47 seconds.
I've tried uploading to the same directory through the AWS S3 CLI and it is much faster, ~2 minutes. I was wondering if there is any issue with doing uploads directly in the S3 console. I'm also thinking this could be a risk, because I want my application to be able to upload to S3 using a signed URL, but that is taking a similar amount of time to the console upload.
Google Drive upload: 49 seconds.
S3 console upload: REALLY SLOW (>10 minutes before I gave up).
AWS CLI (no custom settings): ~2 minutes.
Upload through my UI: similar to the S3 console upload time.
You should be using the S3 multipart upload API for uploading large files to S3.
The Multipart Upload API enables you to upload large objects in parts. You can use this API to upload new large objects or make a copy of an existing object.
The reason your CLI upload is quicker is that it automatically uses the multipart API for big objects internally.
The recommended method is to use aws s3 commands (such as aws s3 cp) for multipart uploads and downloads, because these aws s3 commands automatically perform multipart uploading and downloading based on the file size.
Source : https://docs.aws.amazon.com/AmazonS3/latest/dev/qfacts.html
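To get the same behavior from your own application (e.g., behind your UI) rather than the CLI, the JavaScript SDK ships a high-level helper that splits large bodies into parts automatically. A minimal sketch, assuming the v3 package @aws-sdk/lib-storage (bucket, key, and file path are placeholders):

```typescript
import { S3Client } from "@aws-sdk/client-s3";
import { Upload } from "@aws-sdk/lib-storage";
import { createReadStream } from "node:fs";

const s3 = new S3Client({});

// Upload splits the stream into parts and sends them concurrently,
// much like what `aws s3 cp` does under the hood.
const upload = new Upload({
  client: s3,
  params: {
    Bucket: "my-bucket",   // placeholder
    Key: "large-file.bin", // placeholder
    Body: createReadStream("large-file.bin"),
  },
  partSize: 8 * 1024 * 1024, // 8 MB parts
  queueSize: 4,              // number of parts uploaded in parallel
});

await upload.done();
```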
I am working on uploading image files to an AWS S3 bucket using the putObject method in Lambda.
https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#putObject-property
But putObject is taking more than 20 seconds to upload a 5 MB image file, and all of my resources are hosted in the same region, so using the accelerate endpoint also makes no difference.
http://s3-accelerate-speedtest.s3-accelerate.amazonaws.com/en/accelerate-speed-comparsion.html?region=REGION-NAME&origBucketName=BUCKET-NAME
Is this the expected time to upload a file, or is there another way to speed up the upload?
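For reference, a minimal sketch of the putObject call as described in the linked docs (AWS SDK for JavaScript v2); the bucket, key, and event shape are hypothetical placeholders:

```typescript
import { S3 } from "aws-sdk"; // AWS SDK for JavaScript v2, per the linked docs

const s3 = new S3();

export const handler = async (event: { body: string }) => {
  // Decode the incoming image payload and write it to S3 in one call.
  await s3
    .putObject({
      Bucket: "my-image-bucket", // placeholder
      Key: "images/photo.jpg",   // placeholder
      Body: Buffer.from(event.body, "base64"),
      ContentType: "image/jpeg",
    })
    .promise();
  return { statusCode: 200 };
};
```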
We created a VPC endpoint for the S3 service, and now there is a good improvement in performance while uploading files.
I have to upload video files into an S3 bucket from my React web application. I am currently developing a simple React application, and from it I am trying to upload video files into an S3 bucket, so I have considered two approaches for implementing the uploading part.
1) Amazon EC2 instance: From the front end, I hit an API served from an Amazon EC2 instance, and the EC2 instance uploads the files into the S3 bucket.
2) Amazon API Gateway + Lambda: I send the local files directly into an S3 bucket through API Gateway + a Lambda function by calling the HTTPS URL with the data.
But I am not happy with these two methods because both are costly. I have to upload files of more than 200 MB into an S3 bucket, and I don't know how to optimize this uploading process. The video uploading part is essential for my application, so I have to be very careful with it while improving performance and keeping costs down.
If someone knows a solution, please share it; it would be very helpful for me to continue my work.
Thanks in advance.
You can directly upload files from your React app to S3 using the AWS JavaScript SDK and Cognito identity pools, and for the optimization part you can use AWS's multipart upload capability to upload the file in multiple parts. I'm providing links to read about it further:
AWS javascript upload image example
cognito identity pools
multipart upload to S3
Also take a look at the AWS managed upload helper for the JavaScript SDK:
aws managed upload javascript
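A minimal browser-side sketch combining these pieces, assuming the v3 packages @aws-sdk/client-s3, @aws-sdk/credential-providers, and @aws-sdk/lib-storage; the region, identity pool ID, bucket, and key are placeholders:

```typescript
import { S3Client } from "@aws-sdk/client-s3";
import { fromCognitoIdentityPool } from "@aws-sdk/credential-providers";
import { Upload } from "@aws-sdk/lib-storage";

// Credentials come from the Cognito identity pool and are scoped by its IAM role.
const s3 = new S3Client({
  region: "us-east-1", // placeholder
  credentials: fromCognitoIdentityPool({
    clientConfig: { region: "us-east-1" }, // placeholder
    identityPoolId: "us-east-1:xxxx-xxxx", // placeholder
  }),
});

// The managed Upload helper performs a multipart upload for large bodies.
export async function uploadVideo(file: File): Promise<void> {
  const upload = new Upload({
    client: s3,
    params: { Bucket: "my-video-bucket", Key: file.name, Body: file }, // placeholders
    partSize: 10 * 1024 * 1024, // 10 MB parts
  });
  await upload.done();
}
```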
In order to bypass EC2, you can use a pre-signed POST request to directly upload your content from the browser to the S3 bucket.
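A minimal sketch of generating such a POST policy on the server, assuming the v3 package @aws-sdk/s3-presigned-post (bucket, key, and size limit are placeholders):

```typescript
import { S3Client } from "@aws-sdk/client-s3";
import { createPresignedPost } from "@aws-sdk/s3-presigned-post";

const s3 = new S3Client({});

// Returns a URL plus form fields; the browser POSTs the file to that URL
// as multipart/form-data with the fields included.
export async function presignPost(key: string) {
  const { url, fields } = await createPresignedPost(s3, {
    Bucket: "my-video-bucket", // placeholder
    Key: key,
    Conditions: [["content-length-range", 0, 500 * 1024 * 1024]], // cap at 500 MB
    Expires: 3600, // seconds
  });
  return { url, fields };
}
```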