AWS Amplify Storage | Upload large file

Using AWS Amplify Storage, uploading a file to AWS S3 should be simple:
Storage.put(key, blob, options)
The above works without problems for smaller files (no larger than around 4 MB).
Uploading anything larger, e.g. a 25 MB video, does not work: Storage just freezes (the app does not freeze, only Storage). No error is returned.
Question: How can I upload larger files using AWS Amplify Storage?
Side note: the described behaviour appears on both Android and iOS.
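For context, here is roughly the call pattern in question, written as a minimal sketch (the uploadVideo helper, key and contentType are illustrative, not part of the original post):
import { Storage } from 'aws-amplify';
// Read a local file URI into a Blob, then hand it to Storage.put.
// Note that in React Native, fetch() loads the whole file into memory.
async function uploadVideo(localUri, key) {
  const response = await fetch(localUri);
  const blob = await response.blob();
  return Storage.put(key, blob, { contentType: 'video/mp4' });
}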

Amplify now automatically segments large files into 5 MB chunks and uploads them using the Amazon S3 multipart upload process:
https://aws.amazon.com/about-aws/whats-new/2021/10/aws-amplify-javascript-file-uploads-storage/
https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpuoverview.html#mpu-process
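For illustration, a minimal sketch of how this looks from the caller's side (the key, contentType and helper name are placeholders): the chunking happens inside Storage.put, and progressCallback reports bytes across all parts.
import { Storage } from 'aws-amplify';
async function putWithProgress(key, blob) {
  return Storage.put(key, blob, {
    contentType: 'video/mp4', // placeholder
    // Called as chunks of the multipart upload complete.
    progressCallback(progress) {
      console.log(`Uploaded ${progress.loaded} of ${progress.total} bytes`);
    },
  });
}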

After updating to
"aws-amplify": "ˆ4.3.11",
"aws-amplify-react-native": "^6.0.2"
uploads over 100MB are not freezing UI anymore + we also migrated to resumable uploads. When we used older version of aws-amplify": "^3.1.1", the problems like you mentioned were present.
Here is the pull request from Dec, 2021 for mentioned fixes:
https://github.com/aws-amplify/amplify-js/pull/8336
So the solution really is to upgrade the AWS Amplify library.
However, this approach works only on iOS.
Uploading big media files on Android results in a network error when calling fetch() (a required step before calling the Storage.put method).
Although the same approach works perfectly on the web, in React Native uploading big files is not implemented optimally (keeping in mind that the whole file has to be loaded into memory using fetch()).
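For reference, a rough sketch of the resumable upload API introduced around that release (option names follow the Amplify docs for aws-amplify 4.3.x; the key and callbacks are illustrative):
import { Storage } from 'aws-amplify';
function startResumableUpload(key, blob) {
  const upload = Storage.put(key, blob, {
    resumable: true,
    progressCallback: (progress) => console.log(progress.loaded, progress.total),
    completeCallback: (event) => console.log('Finished:', event.key),
    errorCallback: (err) => console.error('Upload failed:', err),
  });
  // The returned task can be paused and resumed, or cancelled via Storage.cancel(upload).
  return upload;
}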

Related

How to set no cache AT ALL on AWS S3?

I started to use AWS S3 to provide a fast way for my users to download the installation files of my Win32 apps. Each install file is about 60 MB and downloads are working very fast.
However, when I upload a new version of the app, S3 keeps serving the old file instead! I simply rename the old file and upload the new version under the same name as the old one. After I upload, when I try to download, the old version is downloaded instead.
I searched for some solutions and here is what I tried:
- Edited all TTL values on CloudFront to 0
- Edited the 'Cache-Control' metadata with the value 'max-age=0' for each file in the bucket
None of these fixed the issue; AWS keeps serving the old file instead of the new one!
I will often upload new versions, so I need S3 to never serve from cache at all when users try to download.
Please help.
I think this behavior might be because S3 uses an eventually consistent model, meaning that updates and deletes propagate eventually but are not guaranteed to take effect immediately, or even within a specific amount of time (see here for the specifics of their consistency approach). Specifically, they say "Amazon S3 offers eventual consistency for overwrite PUTS and DELETES in all Regions", and I think the case you're describing would be an overwrite PUT. There is a good answer on a similar issue here: How long does it take for AWS S3 to save and load an item? It touches on the consistency issue and how to get around it; hopefully that's helpful.
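As a side note on the Cache-Control attempts in the question: the header can also be set at upload time rather than edited afterwards. A minimal sketch with the AWS SDK for JavaScript (bucket, key and header value are placeholders):
const AWS = require('aws-sdk');
const fs = require('fs');
const s3 = new AWS.S3();
// Upload the installer with a Cache-Control header so CloudFront and browsers
// revalidate instead of serving a stale copy.
s3.putObject({
  Bucket: 'my-installer-bucket',                      // placeholder
  Key: 'setup.exe',                                   // placeholder
  Body: fs.createReadStream('./setup.exe'),
  CacheControl: 'no-cache, no-store, must-revalidate',
  ContentType: 'application/octet-stream',
}, (err, data) => {
  if (err) console.error(err);
  else console.log('Uploaded, ETag:', data.ETag);
});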

Difference between upload() and putObject() for uploading a file to S3?

In the aws-sdk's S3 class, what is the difference between upload() and putObject()? They seem to do the same thing. Why might I prefer one over the other?
The advantages of using the AWS SDK's upload() over putObject() are as follows:
- If the reported MD5 upon upload completion does not match, it retries.
- If the file size is large enough, it uses multipart upload to upload parts in parallel.
- It retries based on the client's retry settings.
- It supports progress reporting.
- It sets the ContentType based on the file extension if you do not provide it.
upload() also allows you to control how your object is uploaded: for example, you can define concurrency and part size.
From their docs:
Uploads an arbitrarily sized buffer, blob, or stream, using intelligent concurrent handling of parts if the payload is large enough.
One specific benefit I've discovered is that upload() will accept a stream without a content length defined whereas putObject() does not.
This was useful as I had an API endpoint that allowed users to upload a file. The framework delivered the file to my controller in the form of a readable stream without a content length. Instead of having to measure the file size, all I had to do was pass it straight through to the upload() call.
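For illustration, a rough sketch of an upload() call (SDK v2) with a stream body and explicit part size/concurrency settings (bucket, key and file path are placeholders):
const AWS = require('aws-sdk');
const fs = require('fs');
const s3 = new AWS.S3();
// upload() accepts a stream without a known length and manages the multipart
// parts itself; partSize and queueSize control part size and concurrency.
const managed = s3.upload(
  {
    Bucket: 'my-bucket',                       // placeholder
    Key: 'uploads/video.mp4',                  // placeholder
    Body: fs.createReadStream('./video.mp4'),
  },
  { partSize: 10 * 1024 * 1024, queueSize: 4 }
);
managed.on('httpUploadProgress', (p) => console.log(`${p.loaded} bytes sent`));
managed.promise()
  .then((data) => console.log('Done:', data.Location))
  .catch((err) => console.error(err));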
When looking for the same information, I came across: https://aws.amazon.com/blogs/developer/uploading-files-to-amazon-s3/
This source is a little dated (it references upload_file() and put(), so it may be the Ruby SDK), but it suggests that putObject() is intended for smaller objects than upload().
It recommends upload() and specifies why:
This is the recommended method of using the SDK to upload files to a bucket. Using this approach has the following benefits:
- Manages multipart uploads for objects larger than 15MB.
- Correctly opens files in binary mode to avoid encoding issues.
- Uses multiple threads for uploading parts of large objects in parallel.
Then covers the putObject() operation:
For smaller objects, you may choose to use #put instead.
EDIT: I was having problems with the .abort() operation on my .upload() and found this helpful: abort/stop amazon aws s3 upload, aws sdk javascript
Now my various other events from https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/Request.html are firing as well! With .upload() I only had 'httpUploadProgress'.
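For illustration, a rough sketch of that pattern: keep a reference to the request returned by putObject() so it can be aborted later, and subscribe to the AWS.Request events linked above (bucket, key and body are placeholders):
const AWS = require('aws-sdk');
const s3 = new AWS.S3();
const body = Buffer.from('example payload');   // placeholder body
const request = s3.putObject({
  Bucket: 'my-bucket',            // placeholder
  Key: 'uploads/report.txt',      // placeholder
  Body: body,
});
request
  .on('httpUploadProgress', (p) => console.log(p.loaded, p.total))
  .on('success', () => console.log('Upload complete'))
  .on('error', (err) => console.error(err))
  .send();
// Later, e.g. when the user cancels:
// request.abort();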
This question was asked almost six years ago and I stumbled across it while searching for information on the latest AWS Node.js SDK (V3). While V2 of the SDK supports the "upload" and "putObject" functions, the V3 SDK only supports "Put Object" functionality as "PutObjectCommand". The ability to upload in parts is supported as "UploadPartCommand" and "UploadPartCopyCommand", but the standalone "upload" function available in V2 is not, and there is no "UploadCommand".
So if you migrate to the V3 SDK, you will need to migrate to Put Object. Get Object is also different in V3: instead of a Buffer, it returns a readable stream or a Blob. So if you got the data through "Body.toString()", you now have to implement a stream reader or handle Blobs.
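For illustration, a minimal sketch of reading a Get Object response in V3 on Node.js, where Body is a readable stream (the helper name is made up; bucket, key and region are parameters/placeholders):
const { S3Client, GetObjectCommand } = require('@aws-sdk/client-s3');
const client = new S3Client({ region: 'us-east-1' });   // region is a placeholder
// Collect the stream's chunks instead of calling Body.toString().
async function getObjectAsString(bucket, key) {
  const { Body } = await client.send(new GetObjectCommand({ Bucket: bucket, Key: key }));
  const chunks = [];
  for await (const chunk of Body) chunks.push(chunk);
  return Buffer.concat(chunks).toString('utf8');
}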
EDIT:
The upload helper for the AWS Node.js SDK (V3) can be found in the @aws-sdk/lib-storage package. Here is a direct link: https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/modules/_aws_sdk_lib_storage.html
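A rough sketch of using that package (bucket, key, file path and tuning values are placeholders):
const { S3Client } = require('@aws-sdk/client-s3');
const { Upload } = require('@aws-sdk/lib-storage');
const fs = require('fs');
// The Upload class is the closest V3 equivalent of V2's upload(): it handles
// multipart uploads and emits progress events.
async function uploadLargeFile() {
  const upload = new Upload({
    client: new S3Client({}),
    params: {
      Bucket: 'my-bucket',                       // placeholder
      Key: 'uploads/video.mp4',                  // placeholder
      Body: fs.createReadStream('./video.mp4'),
    },
    partSize: 5 * 1024 * 1024,   // 5 MB minimum part size
    queueSize: 4,                // number of parts uploaded concurrently
  });
  upload.on('httpUploadProgress', (p) => console.log(p.loaded, p.total));
  return upload.done();
}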

Media files on Heroku

If I host a small Django website on Heroku and I am using just one dyno, is it safe to upload media files to that server, or do I need to use AWS S3 storage to store media files? What are other alternatives for media storage?
No, it is never safe to store things on the Heroku filesystem. Even though you only have one dyno, it is still ephemeral, and can be killed at any time; for example when you push new code.
Using S3 is the way to go (alternatives are the Azure and Google offerings). There are several other advantages to using S3, mainly the ability to serve files without stressing your small server.
While your site is small, a dyno is very small as well, so a major advantage of S3, if used correctly, is that you can have the backing of the AWS S3 infrastructure to serve the files. By "used correctly", I mean that you want to upload and serve files directly to/from S3, so your server is only used for signing the S3 URLs and the actual files never go through your server.
Check https://devcenter.heroku.com/articles/s3-upload-python and http://docs.fineuploader.com/quickstart/01-getting-started.html (I strongly recommend Fine Uploader if you can use the free version or afford the small license fee).
Obviously, you can also just implement S3 media files in Django using django-storage-redux, but that means your server will be busy uploading files. If that's OK for your small server, then that is fine too.
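To illustrate the signing step, a minimal sketch with the AWS SDK for JavaScript, for consistency with the rest of this page (a Django app would do the same with boto3's generate_presigned_url; bucket, key and expiry are placeholders):
const AWS = require('aws-sdk');
const s3 = new AWS.S3();
// The server only signs a short-lived URL; the browser then PUTs the file
// straight to S3, so the dyno never handles the file bytes.
const uploadUrl = s3.getSignedUrl('putObject', {
  Bucket: 'my-media-bucket',   // placeholder
  Key: 'uploads/avatar.png',   // placeholder
  ContentType: 'image/png',
  Expires: 300,                // URL valid for 5 minutes
});
console.log(uploadUrl);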

AWS S3 file uploads: PHP SDK vs REST API

I need to upload a file to AWS Simple Storage Service from a PHP script. The script gets called from an external program, and for some unknown reason it bombs out as soon as I load the AWS PHP SDK. I've tried everything to get it to work, without any success. I'm therefore thinking of using the AWS S3 REST API instead to upload the file.
My question is, what is the major drawback of using the REST API compared to the PHP SDK? I know it will be a bit harder to use the REST APIs, but if I only need to upload files to S3, would it take significantly more time? Or would it be worth spending another half a day (hopefully) trying to get the script to run while using the SDK?