I want to do the following:
Have a single bucket
Have multiple users be able to add/read/access objects with a specific project folder prefix
Not allow users to access objects in projects they don't belong to
So for example, if you have a project with id 1, multiple users can create objects under it:
user_1 created 1/image_1.jpg
user_2 read 1/image_1.jpg
user_2 created 1/image_2.jpg
However, users who don't belong to the project can't:
NOT ALLOWED user_3 read 1/image_1.jpg
Everything I've found online revolves around each user having their own folder, by creating an IAM role that only allows access to objects prefixed with the user's id. That approach creates user folders; I want project folders.
The typical architecture is:
When an application wants to display a private object, or provide a link to a private object, it generates a Pre-signed URL.
This pre-signed URL provides time-limited access to a private object.
Users can use the link to view/download the object. For example, it might be used in an <img> tag to display a picture, or in an <a> tag to provide a link.
When a user wants to upload an object, they can upload it using a pre-signed URL. This can control where the object is uploaded, the type of file, maximum size, etc.
This way, the application has full control over which objects the user can upload/download, which gives much more fine-grained control than having to create IAM rules for every combination of user, project, folder, object, etc. The pre-signed URL can be used to directly access S3, but only to do what the application has authorized.
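As a rough illustration of the project-folder case from the question, the application could check project membership before signing anything. This is a minimal sketch under stated assumptions, not a definitive implementation: `PROJECT_MEMBERS`, `project_of`, `presign_for_member` and `s3_signer` are hypothetical names, the membership table stands in for the application's database, and the bucket name is whatever your deployment uses. The only AWS call is boto3's `generate_presigned_url`, imported lazily so the authorization logic can be exercised without AWS credentials.

```python
from typing import Callable, Dict, Set

# Hypothetical membership table: project id -> user ids.
# In a real application this would be a database query.
PROJECT_MEMBERS: Dict[str, Set[str]] = {"1": {"user_1", "user_2"}}

# A signer takes (object key, lifetime in seconds) and returns a URL.
Signer = Callable[[str, int], str]


def project_of(key: str) -> str:
    """Return the project prefix of a key such as '1/image_1.jpg'."""
    project, sep, rest = key.partition("/")
    if not sep or not project or not rest:
        raise ValueError(f"key has no project prefix: {key!r}")
    return project


def presign_for_member(user_id: str, key: str, sign: Signer,
                       expires_in: int = 900) -> str:
    """Sign `key` only if `user_id` belongs to the key's project."""
    members = PROJECT_MEMBERS.get(project_of(key), set())
    if user_id not in members:
        raise PermissionError(f"{user_id} may not access {key}")
    return sign(key, expires_in)


def s3_signer(bucket: str) -> Signer:
    """Build a signer backed by boto3 (assumes AWS credentials are configured)."""
    import boto3  # imported lazily; only needed when talking to AWS

    client = boto3.client("s3")

    def sign(key: str, expires_in: int) -> str:
        return client.generate_presigned_url(
            "get_object",
            Params={"Bucket": bucket, "Key": key},
            ExpiresIn=expires_in,
        )

    return sign
```

With this arrangement user_3 is refused before any URL is generated, and adding a user to a project is a database change rather than an IAM change.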
I have several Rise 360 courses that I have exported as web apps and added to my S3 bucket. I want to know the best way to sell access to these web apps from my website, which is built on the WordPress platform. I currently have 10 web apps in one bucket.
I don't want people to be able to take the URL and post it somewhere.
Content in Amazon S3 is private by default. Access is only available if you grant access in some way.
A good way to grant access to private content is to use Amazon S3 pre-signed URLs. These grant temporary access to private objects.
The flow would work something like this:
A user purchases a course
They then access a "My Courses" page
When generating that page, the PHP code would consult a database to determine what courses they have purchased
For each course they are allowed to access, the PHP code will generate a pre-signed URL to the course in Amazon S3. The URL can be configured to provide access for a period of time, such as 30 minutes
The user follows that URL and accesses the course. (Note: This assumes that only a single object is accessed.)
Once the expiry time is passed, the object is no longer accessible. The user would need to return to the "My Courses" page and click a newly-generated link to access the course again
If a user extracts the URL from the page, they will be able to download the object. You say "I don't want people to be able to take the URL and post it somewhere." This is not possible to guarantee because the app is granting them access to the object. However, that access will be time-limited so if they share the URL, it will stop working after a while.
If your app requires access to more than one URL (eg if the first page refers to a second page), then this method will not work. Instead, users will need to access the content via your app, with the app checking their access every time rather than allowing users to access the content directly from S3.
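The "My Courses" step of the flow above could look roughly like the following. The answer mentions PHP; this sketch is in Python and only shows the shape of the logic. `PURCHASES`, `COURSE_KEYS`, `my_courses_links` and the injected `sign` callable are assumptions for illustration: in practice the two tables would be database queries and `sign` would wrap the AWS SDK's pre-signed URL call.

```python
from typing import Callable, Dict, List

# Hypothetical purchase records: user id -> purchased course ids.
PURCHASES: Dict[str, List[str]] = {"alice": ["course-a", "course-c"]}

# Hypothetical mapping from course id to the single S3 object for that course.
COURSE_KEYS: Dict[str, str] = {
    "course-a": "course-a/index.html",
    "course-b": "course-b/index.html",
    "course-c": "course-c/index.html",
}

LINK_TTL_SECONDS = 30 * 60  # the 30-minute lifetime used as an example above


def my_courses_links(user_id: str,
                     sign: Callable[[str, int], str]) -> Dict[str, str]:
    """Return one freshly signed URL per course the user has purchased."""
    links: Dict[str, str] = {}
    for course_id in PURCHASES.get(user_id, []):
        links[course_id] = sign(COURSE_KEYS[course_id], LINK_TTL_SECONDS)
    return links
```

Because the links are generated each time the page is rendered, a returning user always gets URLs with a full lifetime, while a link copied elsewhere stops working once it expires.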
I am developing an LMS in Laravel. I upload all the video files to an AWS S3 bucket and play them using the Video.js player. The problem is that users can download the video files, which I want to prevent. Is this possible? If so, how can I do it?
Objects in Amazon S3 are private by default.
However, if you wish students to make use of a file (eg a learning course video), you will need to grant access to the file. The best way to do this is by using Amazon S3 pre-signed URLs, which provide time-limited access to a private object.
For example, the flow would be:
A student logs into the LMS
A student requests access to a course
The LMS checks whether they are entitled to view the course (using your own business logic)
If they are permitted to use the course, the LMS generates a pre-signed URL using a few lines of code, and returns the link in a web page (eg via an <a> tag).
The student can access the content
Once the expiry duration has passed, the pre-signed URL no longer works
However, during the period where the student has access to the file, they can download it. This is because access has been granted to the object. This is necessary because the web browser needs access to the object.
The only way to avoid this would be to provide courseware on a 'streaming' basis, where there is a continuous connection between the frontend and backend. This is not likely to be how your LMS is designed.
In my app, users upload images and I add a watermark to them. The original and watermarked images are stored in different folders.
I want the original images to be shown only to the users who uploaded them and to be private to everyone else.
In short: accessible to the owner who uploaded the image, private for the rest.
I can't find any relevant bucket policy for this.
Is it possible to do this?
If the data belongs to a specific user, my rule of thumb is to keep it private.
Never make user data in S3 public. A single script can work out the pattern of the object names, and then anyone can access anybody's image data.
If the images are some sort of shared asset then it's fine to make them public, but the rule of thumb is "user data in S3 should be private".
Here is a guideline on how to secure data in S3. You should also read up on user data policies, or declare a user data policy for your app.
"I want the original images to be shown only to the users who uploaded them and to be private to everyone else."
The best option is a pre-signed URL. Generate a pre-signed URL for access to an object; you can also set a time limit, after which the URL expires and no longer works.
You can also read this slide deck: amazon-s3-bucket-file-download-through-presigned-timebound-urls
Save the object so that its name contains the user's ID, or attach the user's ID as object metadata, and save the object name in the database. When a user requests a file, cross-check that metadata against the requesting user and only then generate the pre-signed URL.
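The cross-check described above might be sketched like this. Everything here is an assumption for illustration: the `originals/<user id>/<filename>` naming scheme, the in-memory `UPLOADS` dict standing in for the database table, and the injected `sign` callable, which in a real app would wrap the SDK call that produces the pre-signed URL.

```python
from typing import Callable, Dict

# Stand-in for a database table: object key -> id of the user who uploaded it.
UPLOADS: Dict[str, str] = {}


def record_upload(user_id: str, filename: str) -> str:
    """Build a key that embeds the owner's id and remember who owns it."""
    key = f"originals/{user_id}/{filename}"
    UPLOADS[key] = user_id
    return key


def original_image_url(requesting_user: str, key: str,
                       sign: Callable[[str, int], str],
                       expires_in: int = 300) -> str:
    """Return a short-lived URL for an original image, for its owner only."""
    owner = UPLOADS.get(key)
    if owner is None:
        raise KeyError(f"unknown object: {key}")
    if owner != requesting_user:
        raise PermissionError(f"{requesting_user} does not own {key}")
    return sign(key, expires_in)
```

The ownership decision is made from the database record rather than by parsing the key, so renaming objects later does not silently change who may see them.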
Question: Say a user uploads highly confidential information. This is placed in a third party storage server. This third party bucket uses different authentication systems to the web application. What is the best practice for ensuring only the user or an admin staff member can access the file url?
More Context: A Django web application is running on Google App Engine Flexible. Google Storage is used to serve static and media files through Django. The highly confidential information is passports, legal contracts etc.
Static files are served in a fairly insecure way. The /static/ bucket is public, and files are served through Django's static files system. This works because
there is no confidential or user information in any of our static files, only stock images, CSS and JavaScript, and
the files are uglified and minified before production.
For media files, however, we need user-specific permissions: if user A uploads an image, then user A can view it and staff can view it, but user B and unauthenticated users cannot view it under any circumstances. This includes the case where they have the URL.
My preferred system would be, that GCP storage could use the same django authentication server, and so when a browser requested ...google.storage..../media/user_1/verification/passport.png, we could check what permissions this user had, compare it against the uploaded user ID, and decide whether to show a 403 or the actual file.
What is the industry standard / best practice solution for this issue?
Do I make both buckets only accessible to the application, using a service account, and ensure internally that the links are only shared if the correct user is viewing the page? (anyone for static, and {user or staff} for media?)
My questions, specifically (regarding web application security):
1. Is it safe to serve static files from a publicly readable bucket?
2. Is it okay to assume that if my application requests a file URL, this is from an authenticated user?
3. Specifically with regards to Django & GCP Storage, if 2 is false (I believe it is), how do I ensure that files served from buckets are only visible to users with the correct permissions?
Yes, it is. Publicly readable buckets are made for that. Things like CSS, your company's logo, or other files that contain no sensitive data are safe to share.
Of course, do not use the same bucket to store both public and private content. Keep public with public and private with private.
Here is the problem. When you say "authenticated user", to whom do you want that user to be authenticated?
For example, if you authenticate your user using any of Django's methods, then the user is authenticated to Django, but to Cloud Storage they are a stranger. Also, even a user authorized on GCP may not be authorized to access a given bucket on Cloud Storage.
The important thing here is that the one communicating back and forth with Cloud Storage is not the user, it's Django. It can do this by using the Python SDK for Cloud Storage, which takes the credentials of the service account used on the instance to authenticate any request to Cloud Storage. So the service account that is running the VM (because you are on Flexible) is the one that should be authorized on Cloud Storage.
You must first authenticate the user in Django and then check by other means whether the user is allowed to access this file (for example, by storing the names of the files they uploaded in a user_uploaded_files table).
Regarding your first question at the top of the post, Cloud Storage lets you create signed URLs. These URLs allow anyone on the internet to upload/download files from Cloud Storage just by holding the URL. So you only need to authorize the user in Django before handing out the signed URL, and that's it. The user does not need to be "authorized" on Cloud Storage, because the URL already does that.
Taken from the docs linked before:
When should you use a signed URL?
In some scenarios, you might not want to require your users to have a Google account in order to access Cloud Storage, but you still want to control access using your application-specific logic. The typical way to address this use case is to provide a signed URL to a user, which gives the user read, write, or delete access to that resource for a limited time. Anyone who knows the URL can access the resource until the URL expires. You specify the expiration time in the query string to be signed.
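Putting this answer together, a minimal sketch might look like the following. It assumes the `google-cloud-storage` Python client and a hypothetical `USER_UPLOADED_FILES` lookup (a dict here; a Django model in practice). `may_view` and `signed_media_url` are illustrative names, and the staff rule comes from the question's requirements. Signing also assumes the service account's credentials are able to sign (for example a service-account key), which depends on how the app is deployed.

```python
from datetime import timedelta
from typing import Dict

# Stand-in for the user_uploaded_files table: blob name -> uploader's user id.
USER_UPLOADED_FILES: Dict[str, int] = {
    "media/user_1/verification/passport.png": 1,
}


def may_view(user_id: int, is_staff: bool, blob_name: str) -> bool:
    """Staff may view any known file; other users only their own uploads."""
    owner_id = USER_UPLOADED_FILES.get(blob_name)
    if owner_id is None:
        return False
    return is_staff or user_id == owner_id


def signed_media_url(user_id: int, is_staff: bool, bucket_name: str,
                     blob_name: str, minutes: int = 15) -> str:
    """Check permissions in the app first, then ask Cloud Storage to sign."""
    if not may_view(user_id, is_staff, blob_name):
        raise PermissionError(f"user {user_id} may not view {blob_name}")

    # Imported lazily so the permission check can be used without the SDK.
    from google.cloud import storage

    blob = storage.Client().bucket(bucket_name).blob(blob_name)
    return blob.generate_signed_url(
        version="v4",
        expiration=timedelta(minutes=minutes),
        method="GET",
    )
```

The user never authenticates to Cloud Storage; the only thing that reaches the browser is a URL that the application decided to issue.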
Following on from Nahuel Varela's answer:
My system now consists of 4 buckets:
static
media
static-staging
media-staging
Both the static buckets are public, and the media buckets are only accessible to the app engine service account created within the project.
(The settings are different for dev / test)
I'm using django-storages[google] with #elnygrens modification. I modified this to remove the url method for media (so that we create signed URLs) but keep it for static (so that we access the public URL of the static files).
Each file access is authorized in Django: if the user passes the test (is_staff or id matches file id), they are given access to the file for a limited amount of time (currently 1 hour). This access is refreshed when the page loads, etc.
Follow-up question: what is the best practice for this time limit? I've heard people use anywhere from 15 minutes to 24 hours.
I manage a domain of users and would like to be able to transfer all the documents of one user to another user. As far as I understand, the best way to achieve that is to find the file IDs of all files belonging to one user and transfer them to the other user. However, I have a problem constructing the query.
UPDATE:
So the correct query to retrieve the list of files would be:
response = drive_service.files().list(q="'user@company.com' in owners").execute()
However, it only works for me as an admin. If I try to retrieve the list of files for any other user in my domain it returns an empty list.
Files.list retrieves the files of the user making the request, so in this case it returns your own files. The query would only return another user's files if that user also owns one or more of the files in your Drive.
Even as an admin, you cannot access users' files directly.
To access other users' files as an admin, you need to impersonate the users and then perform actions on their behalf.
This is achieved by using a service account with domain-wide delegation of authority.
Here you can find more information on that as well as a python example.
Hope it helps.
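For reference, impersonation with domain-wide delegation could be sketched as follows. This assumes the `google-auth` and `google-api-python-client` packages and Drive API v3 (the question does not say which version it uses). The key file path, the scope and the helper names `owned_by_query`, `drive_service_as` and `list_owned_files` are placeholders, and the service account must already have been granted domain-wide delegation for that scope.

```python
from typing import Any, Dict, List, Optional

# Assumed scope; it must match what was delegated to the service account.
SCOPES = ["https://www.googleapis.com/auth/drive"]


def owned_by_query(email: str) -> str:
    """Build the Drive search query for files owned by `email`."""
    return f"'{email}' in owners"


def drive_service_as(user_email: str, key_path: str = "service-account.json"):
    """Return a Drive v3 client that acts on behalf of `user_email`."""
    # Imported lazily so the helpers above and below work without the SDKs.
    from google.oauth2 import service_account
    from googleapiclient.discovery import build

    credentials = service_account.Credentials.from_service_account_file(
        key_path, scopes=SCOPES
    )
    return build("drive", "v3", credentials=credentials.with_subject(user_email))


def list_owned_files(service: Any, email: str) -> List[Dict[str, Any]]:
    """Collect every file owned by `email`, following result pages."""
    files: List[Dict[str, Any]] = []
    page_token: Optional[str] = None
    while True:
        response = service.files().list(
            q=owned_by_query(email),
            fields="nextPageToken, files(id, name)",
            pageToken=page_token,
        ).execute()
        files.extend(response.get("files", []))
        page_token = response.get("nextPageToken")
        if page_token is None:
            return files
```

A call such as `list_owned_files(drive_service_as("user@company.com"), "user@company.com")` would then run the query as that user rather than as the admin.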
If you want to transfer all the files of one user into another user's Drive, the easiest way would be to use the Data Transfer API provided by Google. This way you don't have to list the files and transfer them one by one. You also only need the admin access token and don't need domain-wide delegation. You can get the official documentation here.