What is the recommended way to handle large file uploads to S3? - amazon-web-services

I'm using the AWS SDK for Ruby to upload large files from users to S3.
The server is a Sinatra app with a POST /images endpoint accepting multipart/form-data. I'm experiencing a noticeable delay with user uploads. This is to be expected, because the request to S3 is made synchronously. I wanted to move this to a background job using something like Sidekiq, but I'm not sure I like that solution.
I read online that some people promote direct uploads to S3 from the client side. Some even called this a "best practice." I'm hesitant to do this for several reasons:
My client-side code would be heavily tied to my cloud provider. I love AWS (great experiences), but I like to remain somewhat cloud-agnostic. I don't want my mobile and web apps to have to know the details of my AWS setup. If I choose to move away from S3 at a later date (unlikely but plausible), I would want that to be a seamless transition. This works well enough for a web app, because I can always redeploy quickly. However, I have to worry about mobile: users may not update, and everything becomes much more complicated if some users are uploading to S3 and some are uploading to another service.
Business logic for determining which bucket and region to use would need to live on the client side, or I'd need to expose an endpoint that tells each user which bucket and region to use. In the latter case, the client has to make a request to my server to figure out the parameters before it can even begin uploading to S3. I want to be able to change buckets or re-route users to alternative regions, so I'm not a fan of this tight coupling or of the additional request.
Security is a huge concern. When files are uploaded and processed through my server, I can use AWS IAM to ensure that these files only come from my server. With direct uploads, I believe I have to grant users write access to the bucket, which is problematic. If I put AWS IAM credentials in JavaScript, I don't see how to ensure that users don't get unlimited write access to my bucket; all client-side JavaScript can be read by the user. In addition, I don't know how I would run validations. On my server, I can scan the files and decide whether or not to upload them to S3. If I upload directly from the client, I would have to move this processing into Lambda functions. I'm OK with that, but there is a chance the object could be retrieved by users before the processing has occurred, so I'd have to build some sort of locking system to prevent access before processing.
So, the bottom line is that I have no idea where to go from here. I've hacked together some solutions, but I'm not thrilled with any of them. I'd love to learn how other startups and enterprises are tackling this kind of problem. What would you recommend? How would you counter my arguments? Forgive me if I'm missing something; I'm still relatively new to AWS.

1. If you're worried about changing the storage service, I would suggest putting an API in front of it; that way you can change the backing storage behind your service. The mobile or web client calls the API, and the API places the file wherever it needs to go. You have more control over the API, and you could simply create a signed S3 URL to send to the client and still let the client do the uploading.
2. An API, as in 1, solves this problem too; the client doesn't have to do all the work.
3. Use the AWS Security Token Service (STS) and Temporary Security Credentials.
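To make that concrete, here is a minimal sketch of the signed-URL idea using the Ruby SDK from the question. The endpoint name, bucket, key prefix, and expiry are illustrative assumptions, not anything your setup requires:
require "sinatra"
require "securerandom"
require "json"
require "aws-sdk-s3"

# The client asks the server for an upload target; the server decides the bucket,
# key and expiry, so none of those details have to live in client code.
post "/uploads" do
  s3  = Aws::S3::Resource.new(region: "us-east-1")
  obj = s3.bucket("my-upload-bucket").object("uploads/#{SecureRandom.uuid}")

  # Short-lived pre-signed PUT URL; the client uploads directly to S3 with it,
  # but gets no long-term credentials and no write access beyond this one key.
  content_type :json
  { upload_url: obj.presigned_url(:put, expires_in: 900) }.to_json
end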

I agree with strongjz: you should use an API to upload your files from the server side.
Cloudinary provides an API for uploading images and videos to the cloud.
From my experience using Cloudinary, it could be the right solution for you.
All your images, videos and required metadata are stored and managed by Cloudinary in Amazon S3 buckets owned by Cloudinary.
The default maximum file size limit for videos is 40MB. This can be customized for paid plans.
For example, in Ruby:
Cloudinary::Uploader.upload("sample_spreadsheet.xls", :resource_type => :raw)

Related

How to correctly upload photos/files in a Django + Angular decoupled application to S3?

I have a decoupled application built with Django 2, an API with DRF, and an Angular 6 frontend. I want to enable users to upload photos for their profiles (and probably some PDFs in the future), and after some research I figured out that the most convenient thing to do would be to store these files in an Amazon S3 bucket.
I have found numerous resources about how to upload files to an S3 bucket from both Angular and Django, and now I'm wondering what the best approach would be in my decoupled application: should I manage it on the frontend and not use my backend at all, or should I pass the file from my Angular app to my Django app and then upload it to the bucket from there?
Some pros and cons of both approaches? It's my first time doing this and I haven't been able to find many resources for decoupled applications.
Any help is welcome! thanks!
The best practice
Anything related to data must be managed by your backend (i.e. Django); Angular is just a client.
You should pass the file from Angular to the Django app and then upload it to the S3 bucket from there (see the sketch after the list of cons below).
Cons of uploading from the client
If in the future you develop mobile apps that consume your REST APIs, you will need to rewrite all of that upload handling there as well.
You would have to keep your S3 API keys on the client, where they are easily accessible to attackers.
The dist folder size will increase, which will affect the load time of your site.
If you upload files from the client to the S3 bucket, the upload uses the user's connection, which in most cases is slower than your server's.
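For comparison, the "everything goes through the backend" recommendation above is the same pattern the original Sinatra question describes. The answer itself is about Django, but a rough sketch of the idea in Ruby (bucket name and form field are assumptions) could look like this:
require "sinatra"
require "aws-sdk-s3"

post "/images" do
  upload = params[:file] # multipart/form-data field, assumed to be named "file"
  halt 400, "missing file" unless upload && upload[:tempfile]

  # The file is on the server now, so it can be scanned or validated
  # before anything is written to the bucket.
  s3 = Aws::S3::Resource.new(region: "us-east-1")
  s3.bucket("my-upload-bucket")
    .object("images/#{File.basename(upload[:filename])}")
    .upload_file(upload[:tempfile].path)

  status 201
end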

How to securely use Amazon S3 in a messaging application

So I'm building a messaging app in Cordova, and I was wondering what the best approach is to secure the image files so no one else can view them. I suppose I could just generate random filenames and store them in the database, but that feels like pseudo-security. I also know that you can use createPresignedRequest(), but that's for temporary files, I believe. Maybe I'm missing something, but I can't figure out a good way to do this. I'm also using the PHP SDK. Not too important for the scenario, but I figured I'd mention it.
I also know that you can createPresignedRequest(), but that's for temporary files I believe.
Pre-signed links are temporary, but that doesn't mean the object in S3 has to be.
You can either use pre-signed URLs or Amazon Cognito in combination with AWS IAM roles to grant certain users access to the files.
How it would work with Cognito is described on the following page: https://docs.aws.amazon.com/cognito/latest/developerguide/iam-roles.html
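The question uses the PHP SDK, but the pre-signed-URL idea is the same in any SDK. As a rough sketch with the Ruby SDK used elsewhere on this page (bucket and key are placeholders), keeping the bucket private and handing out short-lived read links looks like this:
require "aws-sdk-s3"

s3  = Aws::S3::Resource.new(region: "us-east-1")
obj = s3.bucket("my-messages-bucket").object("attachments/abc123.jpg")

# The bucket stays private; only someone holding this URL can read the object,
# and the URL stops working after five minutes. The object itself is not deleted.
download_url = obj.presigned_url(:get, expires_in: 300)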

Sinch Framework - Uploading Call Records in S3

Does anybody have information on how to make the Sinch framework upload voice call recordings to AWS S3?
I've created an IAM user on AWS for this, but I could not find where to set the AWS credentials so that Sinch uploads the call recordings automatically. Is this done on the client side, i.e. in the iOS code, or done by the Sinch team manually? Do we need to change anything on the client side for this behaviour?
Please let me know if you have any information regarding this.
Kind Regards,
Engin
You cannot set it yourself. To do so, send an email to support@sinch.com.

Hiding AWS secret from application

I'm a Java backend engineer working on a feature where the frontend (an SPA and an Android app) must send (large) files to S3. Since I have to handle a lot of requests, and to avoid overloading the network, I want to avoid building a 'proxy' service where the frontend sends me the file and I forward it to S3, but I have some concerns about the best way to keep my apps secure.
I looked for solutions, but I couldn't find one that handles exactly what I want.
Amazon S3 upload with not showing secret key in frontend
This post almost has my answer, but I don't have enough reputation to comment.
S3 upload directly in JavaScript
I read some documentation on AWS, but I still have some questions and some requirements.
The solution must allow the client (an authenticated user) to send a file to S3 directly.
It may make a GET call to fetch a token or something similar (without sending a lot of data).
It has to be secure (no secret key exposed to the frontend).
Which solution would be good for me?
The backend could generate a signing key and send it to the frontend, which then makes the request to AWS (http://docs.aws.amazon.com/general/latest/gr/signature-v4-examples.html).
I could use STS to generate temporary credentials for each upload.
Do you think these approaches will work? Which one do you think is better? What are the trade-offs? Is there another way to deal with this problem?
The best thing to do here is to use the Cognito service to generate anonymous credentials in the app that allow an upload to S3. For Android, you can then use the SDK to do multi-part uploads from the device to S3, which will speed up the process as well.
I couldn't find an exact Android example, but this one is for iOS and the terminology should transfer, just with the other SDK: iOSTransferManager.
You can also call Cognito directly from javascript, if you have a web based app: Cognito in JS example
Hope that helps!
- Chris
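As a rough illustration of option 2 from the question (temporary credentials per upload), here is a sketch with the Ruby SDK used elsewhere on this page; the question's stack is Java, and the bucket name, key prefix, and duration are assumptions:
require "json"
require "aws-sdk-core" # provides Aws::STS::Client

sts = Aws::STS::Client.new(region: "us-east-1")

# Scope the temporary credentials down to PutObject under one prefix for this user.
policy = {
  "Version"   => "2012-10-17",
  "Statement" => [{
    "Effect"   => "Allow",
    "Action"   => ["s3:PutObject"],
    "Resource" => ["arn:aws:s3:::my-upload-bucket/user-42/*"]
  }]
}.to_json

creds = sts.get_federation_token(
  name: "upload-user-42",
  policy: policy,
  duration_seconds: 900
).credentials

# creds.access_key_id, creds.secret_access_key and creds.session_token go to the
# client; they expire on their own and can only write under the user-42/ prefix.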

A scheme for expiring downloaded content?

I am going to offer a web API service that allows users to download and "rent" content for a monthly subscription fee. The API will either be open to everyone or possibly just to select parties (not sure yet). Each developer must agree to a license, and they receive a personal developer key. Each software application will have its own key as well. End users will then download the software, which will interact with my service's API. Each user will have a key for each application as well (probably using OAuth).
Content will be cached on first download and accessible offline only via the third-party application that cached it.
If a user cancels their subscription, I plan on doing the following:
Deactivate the user's OAuth key for all applications.
Do not allow the user's account to download new content via the API (and subsequently any software that uses the API).
Now, the big question is: how do I make content expire if they cancel their subscription? If they cancel, they should not have access to content anymore. Here are ideas I've thought of (some of these are half-solutions, not yet fully fleshed out):
Require that applications encrypt downloaded content using the user's OAuth key, making it available to only the application. This will prevent most users from going to the cache directory and just copying and keeping files.
Update the user's key once a month, forcing content to re-cache on a monthly basis. Users could then access content for a month after they cancel their subscription.
Require applications to "phone home" [to the service] periodically and check whether the user's subscription has terminated. If so, require in the API developer license that applications expire the cache. If it is found that applications do not comply, their keys (and possibly all of that developer's keys) are permanently deactivated as a consequence.
One major worry is that some applications may blatantly ignore constraints of the license. Is it generally acceptable to rely on applications abiding by the licensing constraints? Bad idea?
Any other ideas? Maybe a way to make content auto-expire after x days? Something else? I'm open to out-of-the-box ideas.
If you want to control the usage of content, you need to be in control of the access point. Most applications that implement this scheme ship a server or client product that provides access to the content.
I'm assuming your architecture is returning data, pure and simple. If I'm a developer using your web service, what is to prevent me from caching all the responses in static files elsewhere at query time? Nothing, because your access point is your web server. You have no control over my usage of the content once it departs from the access point.
Unless the downloaded content requires a callback to your server when being consumed, you're out of luck with this strategy.