I'm evaluating GCP for my new project, but I'm still trying to figure out how to implement the following feature and what it will cost.
TL;DR
What's the best strategy to serve user-uploaded media from GCP while giving users full control over who can access it?
Feature Description
As a user, I want to upload media (e.g. images, videos) in a private and secure way.
The media must be visible to me and to a specific subgroup of users to whom I've granted access.
Nobody else must be able to access the media, even if they obtain the URL.
The media content would then be displayed on the website.
Dilemma
I would like to use Cloud Storage to store all the media, however, I'm struggling to find a suitable solution for the authorization part.
As far as I can tell, the features related to "Access Control" are mostly tailored to the project and organization level.
The closest feature I've found so far is Signed URLs, but it doesn't satisfy the requirement that the media be inaccessible even to someone who has the URL. Since the URL expires soon after it is issued, it could perhaps be a good compromise.
Another problem with this approach is that the media cannot be cached at the browser level, which could save quite a bit of bandwidth in the long run.
Expensive Solution?
One solution that came to mind is to serve the media through a GCE instance running an app that validates the user, probably through a JWT, and then streams the file back with the appropriate cache headers.
This should satisfy all requirements, but I'm afraid about egress costs skyrocketing :(
Thank you to whoever will help!
Signed URLs are the solution you want.
Create a service account that represents your application. When a user of your application wants to upload an object, vend them a signed URL for performing the upload. The new object will be readable only by your service account (and other members of your project).
When a user wants to view an object, perform whatever checks you like and then vend them a signed URL for reading the object. Set a short expiration time if you are worried about the URLs being shared.
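The upload and view steps above could be sketched roughly as follows, assuming the google-cloud-storage Python library. The helper names (upload_url, download_url, get_blob) and the default lifetimes are illustrative assumptions, not part of the original answer. Signing also requires credentials that are able to sign, such as a service account key.

```python
from datetime import timedelta


def upload_url(blob, content_type, minutes=15):
    """Return a V4 signed URL that lets the holder PUT this one object."""
    return blob.generate_signed_url(
        version="v4",
        expiration=timedelta(minutes=minutes),
        method="PUT",
        content_type=content_type,  # the uploader must send this same Content-Type
    )


def download_url(blob, minutes=5):
    """Return a short-lived V4 signed URL for reading this one object."""
    return blob.generate_signed_url(
        version="v4",
        expiration=timedelta(minutes=minutes),
        method="GET",
    )


def get_blob(bucket_name, object_name):
    """Build a Blob handle (hypothetical helper; the import is lazy so the
    functions above can be exercised without the library installed)."""
    from google.cloud import storage  # pip install google-cloud-storage

    return storage.Client().bucket(bucket_name).blob(object_name)
```

Your application would call download_url(get_blob(...)) only after its own permission check has passed, and hand the resulting URL to the browser.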
I would not advise the GCE-based approach unless you get some additional benefit out of it. I don't see how it adds any additional security to serve the data directly instead of via a signed URL.
Related
I am developing an LMS in Laravel. I upload all the video files to an AWS S3 bucket and play them using the Video.js player. The problem is that users can download the video files, which I want to prevent. Is this possible, and if so, how can I do it?
Objects in Amazon S3 are private by default.
However, if you wish students to make use of a file (eg a learning course video), you will need to grant access to the file. The best way to do this is by using Amazon S3 pre-signed URLs, which provide time-limited access to a private object.
For example, the flow would be:
A student logs into the LMS
A student requests access to a course
The LMS checks whether they are entitled to view the course (using your own business logic)
If they are permitted to use the course, the LMS generates a pre-signed URL using a few lines of code, and returns the link in a web page (eg via an <a> tag).
The student can access the content
Once the expiry duration has passed, the pre-signed URL no longer works
However, during the period where the student has access to the file, they can download it. This is because access has been granted to the object. This is necessary because the web browser needs access to the object.
The only way to avoid this would be to provide courseware on a 'streaming' basis, where there is a continuous connection between the frontend and backend. This is not likely to be how your LMS is designed.
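As a rough illustration of the pre-signed URL flow described in this answer, here is a minimal Python sketch assuming boto3 (the question uses Laravel, so the same idea would be written with the AWS SDK for PHP there). The student and course dictionaries, their keys, and the function names are hypothetical stand-ins for your own business logic.

```python
def course_video_url(s3_client, student, course, expires_in=300):
    """Return a pre-signed GET URL if the student is entitled to the course."""
    # Entitlement check: placeholder for the LMS's own business logic.
    if course["id"] not in student["entitled_course_ids"]:
        raise PermissionError("student is not entitled to this course")

    # The URL stops working once expires_in seconds have passed.
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": course["bucket"], "Key": course["video_key"]},
        ExpiresIn=expires_in,
    )


def make_s3_client():
    """Create a real S3 client (lazy import keeps the sketch testable)."""
    import boto3  # pip install boto3

    return boto3.client("s3")
```

The returned URL is what the LMS would place in the page, for example as the source of the video player.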
Question: Say a user uploads highly confidential information. This is placed in a third party storage server. This third party bucket uses different authentication systems to the web application. What is the best practice for ensuring only the user or an admin staff member can access the file url?
More Context: A Django web application is running on Google App Engine Flexible. Google Storage is used to serve static and media files through Django. The highly confidential information is passports, legal contracts etc.
Static files are served in a fairly insecure way. The /static/ bucket is public, and files are served through Django's static files system. This works because
there is no confidential or user information in any of our static files, only stock images, CSS and JavaScript, and
the files are uglified and minified before production.
For media files, however, we need user-specific permissions: if user A uploads an image, then user A can view it and staff can view it, but user B and unauthenticated users cannot view it under any circumstances. This includes the case where they have the URL.
My preferred system would be one where GCP storage uses the same Django authentication server, so that when a browser requests ...google.storage..../media/user_1/verification/passport.png, we could check what permissions this user has, compare them against the uploader's user ID, and decide whether to return a 403 or the actual file.
What is the industry standard / best practice solution for this issue?
Do I make both buckets only accessible to the application, using a service account, and ensure internally that the links are only shared if the correct user is viewing the page? (anyone for static, and {user or staff} for media?)
My questions, specifically (regarding web application security):
Is it safe to serve static files from a publicly readable bucket?
Is it okay to assume that if my application requests a file URL, the request comes from an authenticated user?
Specifically with regard to Django and GCP Storage, if 2 is false (I believe it is), how do I ensure that files served from buckets are only visible to users with the correct permissions?
Yes, it is. Publicly readable buckets are made for that. Things like CSS, your company's logo, or other files that contain no sensitive data are safe to share.
Of course, do not mix private and public content in the same public bucket. Public with public, private with private.
Here is the problem. When you say "authenticated user", to whom do you want that user to be authenticated?
For example, if you authenticate your user using any of Django's methods, then the user will be authenticated to Django, but to Cloud Storage they will be a stranger. Also, even a user authorized on GCP may not be authorized for a given bucket on Cloud Storage.
The important thing here is that the one communicating back and forth with Cloud Storage is not the user, it's Django. Django can do this using the Cloud Storage Python SDK, which uses the credentials of the service account attached to the instance to authenticate every request to Cloud Storage. So the service account that runs the VM (because you are on Flexible) is the one that should be authorized on Cloud Storage.
You must first authenticate the user in Django and then check by other means whether the user may access this file (for example, by storing the name of each file they uploaded in a user_uploaded_files table).
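To make the ownership-table idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. In a real Django project this would more likely be an ORM model; the column names and helper functions here are assumptions for illustration only.

```python
import sqlite3


def create_schema(conn):
    """Create the table that records which user uploaded which object."""
    conn.execute(
        "CREATE TABLE user_uploaded_files ("
        " user_id INTEGER NOT NULL,"
        " object_name TEXT NOT NULL,"
        " PRIMARY KEY (user_id, object_name))"
    )


def record_upload(conn, user_id, object_name):
    """Remember that user_id uploaded object_name."""
    conn.execute(
        "INSERT INTO user_uploaded_files (user_id, object_name) VALUES (?, ?)",
        (user_id, object_name),
    )


def user_owns_file(conn, user_id, object_name):
    """Return True only if this user uploaded this exact object."""
    row = conn.execute(
        "SELECT 1 FROM user_uploaded_files WHERE user_id = ? AND object_name = ?",
        (user_id, object_name),
    ).fetchone()
    return row is not None
```

The application would run a check like user_owns_file (plus any staff override) before it fetches the object from Cloud Storage or hands out a URL for it.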
Regarding the first question at the top of your post, Cloud Storage lets you create signed URLs. These URLs allow anyone on the internet to upload or download files from Cloud Storage just by holding the URL. So you only need to authorize the user in Django before handing out the signed URL, and that's it. They do not need to be "authorized" on Cloud Storage (because the URL already takes care of that).
Taken from the docs linked before:
When should you use a signed URL?
In some scenarios, you might not
want to require your users to have a Google account in order to access
Cloud Storage, but you still want to control access using your
application-specific logic. The typical way to address this use case
is to provide a signed URL to a user, which gives the user read,
write, or delete access to that resource for a limited time. Anyone
who knows the URL can access the resource until the URL expires. You
specify the expiration time in the query string to be signed.
Following on from Nahuel Varela's answer:
My system now consists of 4 buckets:
static
media
static-staging
media-staging
Both the static buckets are public, and the media buckets are only accessible to the app engine service account created within the project.
(The settings are different for dev / test)
I'm using django-storages[google] with #elnygrens modification. I modified this to remove the url method for media (so that we create signed URLs) but kept it for static (so that we access the public URL of the static files).
Each file access is authorized in Django: if the user passes the test (is_staff, or their id matches the file id), they're given access to the file for a limited time (currently 1 hour). This access is refreshed whenever the page loads.
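The access rule described here might look something like the following sketch. The User and MediaFile classes are simplified stand-ins for the Django models, and sign is a hypothetical callable that wraps whatever produces the signed URL (for example the storage backend); none of these names come from the original setup.

```python
from dataclasses import dataclass
from datetime import timedelta

SIGNED_URL_TTL = timedelta(hours=1)  # the lifetime currently used in this setup


@dataclass
class User:
    id: int
    is_staff: bool = False
    is_authenticated: bool = True


@dataclass
class MediaFile:
    owner_id: int
    object_name: str


def can_view(user, media):
    """Staff may view everything; other users only their own uploads."""
    if not user.is_authenticated:
        return False
    return user.is_staff or user.id == media.owner_id


def media_url(user, media, sign):
    """Return a time-limited URL, or None if the user may not see the file."""
    if not can_view(user, media):
        return None
    return sign(media.object_name, SIGNED_URL_TTL)
```

Because the URL is generated on each page load, the access window effectively restarts every time an authorized user views the page.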
Follow-up question: what is the best practice for this time limit? I've heard people use anywhere from 15 minutes to 24 hours.
I have a bunch of videos and all of them are uploaded on Wistia. On Wistia, I have set up access for my domain, so they will play only when the videos are fetched from my domain.
If someone uses View Source and copies the video URL and pastes it in a separate browser window, they get an "access denied" message.
I'm thinking about moving my videos to Google Cloud Storage. So, my questions are:
Does Google cloud provide a similar domain restriction feature?
How can I set this up? For now, I've created a temporary bucket and uploaded a video and granted it public access. Then I copied the public link of the MP4 file and added to my website, and it obviously plays, but then any paid member can use View Source, copy the MP4 link and upload it to other streaming services for everyone to see.
EDIT
Is there a way to do this programmatically? My website is in PHP, so something along these lines: keep the bucket restricted, and then, through PHP, pass some key and retrieve the video file. I'm not sure if something like this is possible.
Thanks
I do not believe that there is an access control mechanism in Google Cloud Storage equivalent to the one you are using in Wistia.
There are several methods to restrict object access (see https://cloud.google.com/storage/docs/access-control) in GCS, but none of them are based upon where the request came from. The only one that kind of addresses your issue is to use Signed URLs. Basically, a user would go to your site, but instead of giving them the "real" URL of the object they are going to be using, your application retrieves a special URL that is time-limited. You can set the length of time it is valid for.
But if what you are worried about is people copying your video, presumably they could still see the URL someplace and copy the data from there if they did it immediately, so I don't think that really solves your problem.
Sorry I can't be more helpful.
Good Day Everybody,
I'm fairly new to AWS, and I have a problem right now. I'm not even sure if this is something that is possible with S3. I tried googling it, but couldn't find any proper response (probably because the keywords I searched for don't make much sense ;) ).
So my problem is this: I have a Node application which uploads user images to S3. I want to know how to properly access these images later in the front end (some sort of direct link). But at the same time, I should be able to restrict which users can access the image. For example, if user xyz uploads an image, only that user should be able to see it. If another user, say abc, tries to open the direct link, it should say access restricted or something similar.
Or, if that is not possible, I should at least be able to put an encrypted timestamp on the GET URL, so that the image is accessible through that particular URL for only a limited amount of time.
Thanks in advance.
This is the typical use case for S3 Pre-signed URLs.
In S3, you are able to specify some query string parameters on the URL of your object that include an Access Key, an expiration timestamp and a signature. S3 validates the signature and checks whether the request was made before the expiration timestamp. If that's the case, it will serve the object. Otherwise, it will return an error.
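To show the general idea behind an expiring, signed URL, here is a toy sketch using only the Python standard library. It is not the algorithm S3 actually uses (S3 uses AWS's own signing scheme); the secret, parameter names, and functions are all made up for illustration.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlsplit

SECRET = b"demo-secret-known-only-to-the-server"  # illustrative value only


def _signature(path, expires):
    """HMAC over the path and expiry, so neither can be changed unnoticed."""
    message = f"{path}\n{expires}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def sign_url(path, ttl_seconds, now=None):
    """Append an expiry timestamp and a signature to the path."""
    now = time.time() if now is None else now
    expires = int(now) + ttl_seconds
    query = urlencode({"Expires": expires, "Signature": _signature(path, expires)})
    return f"{path}?{query}"


def verify_url(url, now=None):
    """Accept the URL only if it is unexpired and the signature matches."""
    now = time.time() if now is None else now
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    try:
        expires = int(params["Expires"][0])
        signature = params["Signature"][0]
    except (KeyError, ValueError):
        return False
    if now > expires:
        return False
    return hmac.compare_digest(signature, _signature(parts.path, expires))
```

Changing the path or the expiry invalidates the signature, which is why a pre-signed URL cannot simply be edited to point at another user's image or to last longer.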
The AWS SDK for JavaScript (Node.js) includes an example on how to generate pre-signed URLs: http://docs.aws.amazon.com/AWSJavaScriptSDK/guide/node-examples.html#Amazon_S3__Getting_a_pre-signed_URL_for_a_getObject_operation__getSignedUrl_
I have people uploading video content and I'd like to restrict the video content to ONLY be streamed from my site. Since the video URLs in the video tag are easily accessible through the HTML source, I want to stop people from copying the direct S3 URL and putting it in another tab.
I was looking over the docs here: http://docs.aws.amazon.com/IAM/latest/UserGuide/AccessPolicyLanguage_ElementDescriptions.html#Condition
But it wasn't immediately obvious to me.
Thanks for your help!
You need to make this bucket private and use signed URLs to give access only to your users on your website. Signed URLs have a short life (and the required policy baked into them) when you generate them. This prevents misuse even if somebody steals the URLs (or sends you faked referrer headers, etc.).
You can create these URLs manually (difficult to manage) or programmatically (some coding work required). In the second case, once your website user contacts your server, then generate and serve the auto-expiring URL. Use this URL then on your website.
Overview of Signed URLs - Amazon CloudFront.