URLSession AWS video upload with pre-signed URL really slow - amazon-web-services

I am using a pre-signed URL (generated by our server) to upload to an S3 bucket, with a URLSession background session uploading from a file to the signed URL.
What I have noticed is that if the video is bigger (more than 30 or 50 MB), the upload is really slow. My internet connection is decent, close to 300 Mbps, and a real-time speed test showed more than 10 MBps for both download and upload.
Here is how I am creating session and upload task from file,
let sessionConfiguration: URLSessionConfiguration = URLSessionConfiguration.background(withIdentifier: "SOME_REVERSE_DOMAIN_STRING.backgroundSession")
sessionConfiguration.allowsCellularAccess = true
let backgroundSession: URLSession = URLSession(configuration: sessionConfiguration, delegate: self, delegateQueue: OperationQueue.main)
Upload task, basic usage, nothing fancy here:
let task = backgroundSession.uploadTask(with: request, fromFile: fileUrl!)
task.resume()
Should I use the AWS SDK or the Amplify framework to upload instead? Will it make any difference?

To speed this up you can use multipart upload. In your case, without the SDK, you have to generate a pre-signed URL for each part operation.
Another option is to use S3 Transfer Acceleration.
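As a rough sketch of how the parts could be sized (assuming S3's 5 MiB minimum for every part except the last, and its 10,000-part limit; the 8 MiB default here is an arbitrary illustrative choice):

```python
# Split a file of `total_size` bytes into multipart-upload parts.
# S3 requires every part except the last to be at least 5 MiB,
# and allows at most 10,000 parts per upload.
MIN_PART_SIZE = 5 * 1024 * 1024   # 5 MiB
MAX_PARTS = 10_000

def plan_parts(total_size: int, part_size: int = 8 * 1024 * 1024):
    """Return a list of (part_number, offset, length) tuples."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part size below S3 minimum of 5 MiB")
    parts = []
    offset = 0
    number = 1
    while offset < total_size:
        length = min(part_size, total_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    if len(parts) > MAX_PARTS:
        raise ValueError("too many parts; increase the part size")
    return parts

# A 50 MiB upload split into 8 MiB parts: six full parts plus a 2 MiB tail.
parts = plan_parts(50 * 1024 * 1024)
print(len(parts))  # 7
```

Each of those parts would then get its own pre-signed URL from your server, and the parts can be uploaded in parallel, which is where the speedup comes from.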

Related

Best practices of uploading a file to S3 and metadata to RDS?

Context
I'm building a mock service to learn AWS. I want a user to be able to upload a sound file (which other users can listen to). To do this I need the sound file uploaded to S3 and metadata such as file name, name of uploader, length, and S3 ID uploaded to RDS. It is preferable that the user uploads directly to S3 with a signed URL instead of doubling the data transferred by first uploading it to my server and from there to S3.
Optimally this would be transactional, but from what I have gathered there is no built-in functionality for that. In order to implement this, and minimize the risk of the file being successfully uploaded to S3 but not the metadata to RDS (and vice versa), my best guess is as follows:
My solution
With words:
First I attempt to upload the file to S3 with a key (a UUID) I generate locally or server-side. If this succeeds, I make a request to my API to upload the metadata, including the key, to RDS. If that call is unsuccessful, I remove the object from S3.
With code:
uuid = get_uuid_from_server();
s3Client.putObject({.., key: uuid, ..}, function(err, data) {
  if (err) {
    reject(err);
  } else {
    resolve(data);
    // Upload metadata to RDS through API call to EC2 server.
    // Remove S3 object with key: uuid if this call is unsuccessful.
  }
});
As I'm learning, my approaches are seldom the best practices but I was unable to find any good information on this particular problem. Is my approach/solution above in line with best practices?
Bonus question: is it beneficial for security purposes to generate the file's key (uuid) server-side instead of client-side?
Here are 2 approaches that you can pick, assuming the client is a web browser or mobile app.
1. Use your server as a proxy to S3.
Your server acts as a proxy between your clients and S3. You have full control of the upload flow: you can restrict the supported file types and inspect file contents, for example to make sure the file is a valid sound file, before uploading to S3.
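As a minimal sketch of the kind of content inspection the proxy approach enables (assuming WAV files; a magic-byte check like this is illustrative, not an exhaustive validation):

```python
# Inspect the first bytes of an upload before forwarding it to S3.
# A RIFF....WAVE header marks a WAV file; real validation would go further
# (parse chunks, check duration, cap the file size, etc.).
def looks_like_wav(data: bytes) -> bool:
    return len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE"

header = b"RIFF" + (36).to_bytes(4, "little") + b"WAVE"
print(looks_like_wav(header))        # True
print(looks_like_wav(b"not audio"))  # False
```

The trade-off is that all upload bytes flow through your server, which costs bandwidth and compute; that is exactly what the signed-URL approach below avoids.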
2. Use your server to create pre-signed upload URLs
In this approach, your client first asks the server to create one or more (for multipart upload) pre-signed URLs. Clients then upload to S3 using those URLs. Your server can save those URLs to keep track of the uploads later.
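A sketch of that server-side step, written against a boto3-style client passed in as a parameter so the flow runs without real AWS credentials (the bucket, key, upload ID, and the fake client are placeholders; with the real SDK you would pass a boto3 S3 client and its `generate_presigned_url` method would do the signing):

```python
# Ask an S3 client for one pre-signed upload URL per part, and return
# them so the server can hand them to the client and track the upload.
def create_part_urls(s3, bucket, key, upload_id, n_parts, expires=3600):
    urls = []
    for part_number in range(1, n_parts + 1):
        url = s3.generate_presigned_url(
            "upload_part",
            Params={
                "Bucket": bucket,
                "Key": key,
                "UploadId": upload_id,
                "PartNumber": part_number,
            },
            ExpiresIn=expires,
        )
        urls.append(url)
    return urls

# Stand-in for a boto3 client, just to show the call shape.
class _FakeS3:
    def generate_presigned_url(self, op, Params, ExpiresIn):
        return f"https://example.com/{Params['Key']}?partNumber={Params['PartNumber']}"

urls = create_part_urls(_FakeS3(), "my-bucket", "sound.wav", "upload-123", 3)
print(urls[0])  # https://example.com/sound.wav?partNumber=1
```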
To be notified when the upload finishes successfully or unsuccessfully, you can either
(1) Ask clients to call another API, e.g. /ack, after the upload finishes for a particular signed URL. If this API is not called after some time, e.g. 1 hour, you can check with S3 and delete the file accordingly. You can do this because you stored the signed URL in your DB at the start of the upload.
or
(2) Make use of S3 events. You can configure an ObjectCreated event in S3, which fires whenever an object is created, send all the events to a queue in SQS, and have your server process each event from there. This way you do not rely on clients to update your server after an upload finishes: S3 will notify your server of every successful upload.
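A sketch of the server-side consumer for option (2). The sample body follows the shape of S3's event notification JSON as delivered in an SQS message, trimmed to the fields used here:

```python
import json

# Pull the (bucket, key) pairs out of an S3 ObjectCreated notification
# as it arrives in an SQS message body.
def uploaded_objects(message_body: str):
    event = json.loads(message_body)
    results = []
    for record in event.get("Records", []):
        if record.get("eventName", "").startswith("ObjectCreated"):
            s3 = record["s3"]
            results.append((s3["bucket"]["name"], s3["object"]["key"]))
    return results

body = json.dumps({
    "Records": [{
        "eventName": "ObjectCreated:Put",
        "s3": {"bucket": {"name": "uploads"}, "object": {"key": "sound.wav"}},
    }]
})
print(uploaded_objects(body))  # [('uploads', 'sound.wav')]
```

Each extracted key can then be matched against the pending uploads in your DB, and the metadata row committed to RDS only once the object is confirmed to exist.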

URL expiration clarification for uploading a file through an S3 pre-signed URL

Let's assume we generate a pre-signed URL, with an expiration time of 15 seconds, to upload a file, and we start uploading a large file. Must the upload complete within 15 seconds of the URL's generation, or can it run beyond that as long as the upload starts within the 15-second window?
The upload action must start before the expiry time; there is no known restriction on how long the upload may take once it has started. Since S3 evaluates the permissions for uploading the file when the upload action starts, it is not affected by the time taken to actually transfer the file.
In your case, considering the file size, if the upload fails for any reason, users won't be able to retry after the 15 seconds have passed.
Below are more details on this point from the "Uploading using pre-signed URLs" doc:
That is, you must start the action before the expiration date and time. If the action consists of multiple steps, such as a multipart upload, all steps must be started before the expiration, otherwise you will receive an error when Amazon S3 attempts to start a step with an expired URL.
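The rule can be paraphrased as "each step must start before the expiry". A small sketch of the check a client might make before attempting a retry (the timestamps are illustrative):

```python
from datetime import datetime, timedelta

# A pre-signed URL is usable only if the action *starts* before it expires;
# an in-flight upload may finish after expiry, but a retry may not start.
def can_start_upload(url_created_at: datetime, expires_in_s: int, now: datetime) -> bool:
    return now < url_created_at + timedelta(seconds=expires_in_s)

created = datetime(2023, 1, 1, 12, 0, 0)
print(can_start_upload(created, 15, created + timedelta(seconds=10)))  # True
print(can_start_upload(created, 15, created + timedelta(seconds=20)))  # False: request a fresh URL
```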

Amazon S3: Do not allow client to modify already uploaded images?

We are using S3 for our image upload process, and we approve all the images that are uploaded to our website. The process is:
Clients upload images to S3 from JavaScript at a given path (using a token).
Once we get back the URL from S3, we save the S3 path in our database with an isApproved flag of false in the photos table.
Once the image is approved by our executive, it starts displaying on our website.
The problem is that the user may change the image (to some obscene image) after the approval process, using the token generated. Can we somehow stop users from modifying the images like this?
One temporary fix is to shorten the token lifetime, e.g. to 5 minutes, and approve the images only after that interval.
I saw this, but it didn't help, as versioning also replaces the already uploaded image, moving the previously uploaded one to a new versioned path.
Any better solutions?
You should create a workflow around the uploaded images. The process would be:
The client uploads the image
This triggers an Amazon S3 event notification to you/your system
If you approve the image, move it to the public bucket that is serving your content
If you do not approve the image, delete it
This could be an automated process using an AWS Lambda function to update your database and flag photos for approval, or it could be done manually after receiving an email notification via Amazon SNS. The choice is up to you.
The benefit of this method is that nothing can be substituted once approved.

Regarding boto and AWS: while uploading to S3 as a multipart upload, I am not able to get a result from get_all_multipart_uploads

I have a system in which we upload videos to AWS using multipart upload. I have put this process in a workflow manager task. When the process completes, I update my database with the status of the payload as complete.
If the payload is not in completed status even after 24 hours, I should delete the associated parts of the multipart upload from S3.
Here is what I have:
1. The video details (name).
2. The bucket to which I will be uploading the video.
When I run bucket.get_all_multipart_uploads() I do not get the asset I uploaded, i.e. I don't find the name of the video I put on S3. I am pretty new to this. Can anyone point me to the proper documentation and explain how to identify uploads that are hanging on S3?

Can I recover lost information about an S3 multipart upload?

In this multipart upload example, one needs to save the upload ID and a set of etags corresponding to each uploaded part until the upload is "closed." If I lose my upload ID, I guess I can recover it by looking through open multipart uploads with ListMultipartUploads, but what if I lose an etag? Can those be recovered somehow, or must I abort the whole transfer and start over?
Once you have retrieved the upload ID from ListMultipartUploads, you can use ListParts to get the list of parts (and their ETags) that have been completed for that upload. You can use this information to restart your upload from the last completed part.
Multipart Upload API and Permissions
Example of resuming multipart uploads using AWS SDK for iOS