Allow specific users to see specific files - Django

I'm building a document management system with Django, and I want to allow specific users (identified by their ID or email) to read specific documents: those they uploaded themselves, or those shared with them by the file's creator.
How can I build this feature?

I resolved this by using Google Cloud Storage buckets: every user has their own bucket, and the Django web app's database stores the user/bucket/credentials relationship used for access.
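For readers who would rather keep the access rule inside the application than use one bucket per user, the rule described in the question (uploader, or someone the uploader shared with) can be sketched roughly as follows. This is a framework-agnostic illustration, not the poster's solution: `Document`, `shared_with` and `can_read` are hypothetical names, and in Django the record would more likely be a model with an owner ForeignKey and a ManyToManyField for shares.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical record; in Django this could be a model with an owner
    ForeignKey and a shared_with ManyToManyField."""
    name: str
    owner_id: int
    shared_with: set = field(default_factory=set)  # user IDs granted read access


def can_read(user_id: int, doc: Document) -> bool:
    """Allow the uploader and anyone the uploader shared the document with."""
    return user_id == doc.owner_id or user_id in doc.shared_with


# Example: user 1 uploaded the file and shared it with user 2.
doc = Document(name="contract.pdf", owner_id=1, shared_with={2})
```

A view would call `can_read` before streaming the file or handing out a link, and return a 403 otherwise.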

Related

How should a web application ensure security when serving confidential media files?

Question: Say a user uploads highly confidential information. This is placed in a third party storage server. This third party bucket uses different authentication systems to the web application. What is the best practice for ensuring only the user or an admin staff member can access the file url?
More Context: A Django web application is running on Google App Engine Flexible. Google Storage is used to serve static and media files through Django. The highly confidential information is passports, legal contracts etc.
Static files are served in a fairly insecure way. The /static/ bucket is public, and files are served through Django's static files system. This works because:
there is no confidential or user information in any of our static files, only stock images, CSS and JavaScript, and
the files are uglified and minified before production.
For media files, however, we need user-specific permissions. If user A uploads an image, then user A can view it and staff can view it, but user B and unauthenticated users cannot view it under any circumstances, even if they have the URL.
My preferred system would be for GCP storage to use the same Django authentication server, so that when a browser requested ...google.storage..../media/user_1/verification/passport.png, we could check what permissions this user had, compare them against the uploading user's ID, and decide whether to return a 403 or the actual file.
What is the industry standard / best practice solution for this issue?
Do I make both buckets only accessible to the application, using a service account, and ensure internally that the links are only shared if the correct user is viewing the page? (anyone for static, and {user or staff} for media?)
My questions, specifically (regarding web application security):
1. Is it safe to serve static files from a publicly readable bucket?
2. Is it okay to assume that if my application requests a file URL, the request comes from an authenticated user?
3. Specifically with regard to Django and GCP Storage, if 2 is false (I believe it is), how do I ensure that files served from buckets are only visible to users with the correct permissions?
Yes, it is. Publicly readable buckets are made for that. Things like CSS, your company logo, or other files that contain no sensitive data are safe to share.
Of course, do not mix public and private content in the same bucket. Keep public files with public files and private files with private files.
Here is the problem. When you say "authenticated user", authenticated to what?
For example, if you authenticate your user using any of Django's methods, the user is authenticated to Django, but to Cloud Storage they are still a stranger. Also, even a user who is authorized on GCP may not be authorized on a particular Cloud Storage bucket.
The important thing here is that the party communicating with Cloud Storage is not the user, it's Django. Django can do this through the Cloud Storage Python SDK, which uses the credentials of the service account attached to the instance to authenticate every request to Cloud Storage. So the service account that runs the VM (because you are on Flexible) is the one that must be authorized on Cloud Storage.
You must first authenticate the user in Django and then check by other means whether that user may access the file (for example, by storing the names of the files each user uploaded in a user_uploaded_files table).
Regarding your first question at the top of the post, Cloud Storage lets you create signed URLs. These URLs allow anyone on the internet to upload or download files from Cloud Storage just by holding the URL. So you only need to authorize the user in Django before handing out the signed URL, and that's it. The user does not need to be "authorized" on Cloud Storage, because the URL already carries that authorization.
Taken from the docs linked before:
When should you use a signed URL?
In some scenarios, you might not
want to require your users to have a Google account in order to access
Cloud Storage, but you still want to control access using your
application-specific logic. The typical way to address this use case
is to provide a signed URL to a user, which gives the user read,
write, or delete access to that resource for a limited time. Anyone
who knows the URL can access the resource until the URL expires. You
specify the expiration time in the query string to be signed.
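The flow described above (check the user in Django, then hand out a signed URL) might look roughly like the sketch below. It assumes `blob` behaves like `google.cloud.storage.Blob`, whose `generate_signed_url()` accepts `version`, `expiration` and `method`. The helper names `may_view` and `signed_url_for`, and the 15-minute default lifetime, are illustrative assumptions rather than anything prescribed by the answer.

```python
from datetime import timedelta


def may_view(user_id, is_staff, owner_id):
    """Application-side rule: staff or the uploader may view the file."""
    return bool(is_staff) or user_id == owner_id


def signed_url_for(blob, user_id, is_staff, owner_id,
                   lifetime=timedelta(minutes=15)):
    """Return a short-lived download URL, or raise if the user is not allowed.

    `blob` is assumed to behave like google.cloud.storage.Blob; the
    15-minute default lifetime is an arbitrary choice for this sketch.
    """
    if not may_view(user_id, is_staff, owner_id):
        raise PermissionError("user may not access this file")
    # The service account's credentials sign the URL; the browser never
    # needs credentials of its own for Cloud Storage.
    return blob.generate_signed_url(
        version="v4", expiration=lifetime, method="GET"
    )
```

A Django view could call `signed_url_for` and redirect to the result, translating `PermissionError` into a 403 response.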
Following on from Nahuel Varela's answer:
My system now consists of 4 buckets:
static
media
static-staging
media-staging
Both the static buckets are public, and the media buckets are only accessible to the app engine service account created within the project.
(The settings are different for dev / test)
I'm using django-storages[google] with @elnygren's modification. I changed it to remove the url method for media (so that we create signed URLs) but kept it for static (so that we use the public URLs of the static files).
Each file access is authorized in Django. If the user passes the test (is_staff, or their ID matches the file's owner ID), they are given access to the file for a limited time (currently 1 hour). This access is refreshed when the page loads, and so on.
Follow-up question: what is the best practice for this time limit? I've heard of people using anywhere from 15 minutes to 24 hours.

Best practice to handle different group of users accessing their own content

I am building a web app where different companies will upload their own audio files along with some additional information. I am building it with Django and Postgres and hosting it on AWS. Users belonging to different companies will only be able to access their own company's data when they log into the website.
The website allows those users to upload content, search content and access content.
My question is: what's the best practice for handling this uploaded content? Is it better to create a different schema for each company, or to put all the content together and allow users to access different content based on the company ID that each entry is associated with?
"putting all the content together and allow users to access different content based on the company id that each entry associates with?"
Personally, I would do this, for several reasons:
It's easier to maintain. Adding new companies probably just means a new ID, rather than a new schema and some tables.
You can add security with application code or with database views.
You can have other company specific functionality that uses the same design.
I would also suggest enforcing data security on the database side by only allowing the application to query certain views, each of which is restricted by company ID. That way you can't accidentally SELECT from a base table, forget the company filter, and show a user data that isn't theirs.
This is just my opinion - happy to be proven otherwise.
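As a rough illustration of the shared-table approach, the sketch below keeps every company's rows in one table and funnels all reads through a single function that always applies the company filter. It uses the standard-library `sqlite3` module as a stand-in for Postgres, and the table name `audio_file`, its columns and the sample rows are made up for the example. It shows the application-side variant; the view-based enforcement suggested above would live in the database instead.

```python
import sqlite3


def make_db():
    """Build an in-memory database holding two companies' rows (demo data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE audio_file ("
        "id INTEGER PRIMARY KEY, company_id INTEGER NOT NULL, title TEXT)"
    )
    conn.executemany(
        "INSERT INTO audio_file (company_id, title) VALUES (?, ?)",
        [(1, "intro.wav"), (1, "call.wav"), (2, "memo.wav")],
    )
    return conn


def files_for_company(conn, company_id):
    """The only read path: every query is filtered by the caller's company."""
    rows = conn.execute(
        "SELECT title FROM audio_file WHERE company_id = ? ORDER BY id",
        (company_id,),
    )
    return [title for (title,) in rows]


conn = make_db()
```

In Django the same idea is usually expressed as a queryset filter on the logged-in user's company, ideally applied in one place rather than repeated in every view.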

Google Drive Change File Ownership Using RESTapi in Python

I manage a domain of users and would like to be able to transfer all the documents of one user to another user. As far as I understand, the best way to achieve that is to find the file IDs of all files belonging to one user and transfer them to the other user. However, I am having trouble constructing the query.
UPDATE:
So the correct query to retrieve the list of files would be:
response = drive_service.files().list(q="'user@company.com' in owners").execute()
However, it only works for me as an admin. If I try to retrieve the list of files for any other user in my domain it returns an empty list.
Files.list retrieves the files of the authenticated user, so in this case it returns your own files. That query would only return results for another user if that user is also an owner of one or more of your files.
Even as an admin you cannot access users' files directly.
To access other users' files, as an admin you need to impersonate those users and then perform actions on their behalf.
This is achieved by using a service account with domain wide delegation of authority.
Here you can find more information on that as well as a python example.
Hope it helps.
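Once a Drive client has been built with delegated credentials for the user in question, collecting that user's file IDs with the query from the question could look roughly like this. The sketch assumes `drive_service` is a Drive v3 client object that already impersonates the user; the helper name `list_owned_file_ids` is hypothetical, and setting up the service account itself is not shown.

```python
def list_owned_file_ids(drive_service, owner_email):
    """Collect IDs of all files owned by owner_email, following pagination.

    drive_service is assumed to be a Drive v3 client already built with
    credentials that impersonate the user (domain-wide delegation).
    """
    file_ids = []
    page_token = None
    while True:
        response = drive_service.files().list(
            q=f"'{owner_email}' in owners",
            fields="nextPageToken, files(id)",
            pageToken=page_token,
        ).execute()
        file_ids.extend(item["id"] for item in response.get("files", []))
        # files.list returns results in pages; stop when no token is left.
        page_token = response.get("nextPageToken")
        if not page_token:
            return file_ids
```

Following `nextPageToken` matters here because a single call only returns one page of results, so a user with many files would otherwise be transferred only partially.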
If you want to transfer all the files of one user into another user's Drive, the easiest way would be to use the Data Transfer API provided by Google. This way you don't have to list the files and transfer them one by one. Also you only need the admin access token and wouldn't need domain wide delegation either. You can get the official documentation here

Google directory API - List of modified users

I need to synchronize the users of a domain with a third party database.
Using the method users.list I can query for the complete list of users.
With the complete list I can identify created users and deleted users, but I can't find any way to identify updated users.
Is there a way to identify updated users?
Use Users.watch. It lets you know of any changes made to users. When changes are made you can update your database.
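A rough sketch of registering such a watch is shown below. It assumes `directory_service` is an Admin SDK Directory API client built with the Google API Python client, and that `webhook_url` is an HTTPS endpoint you control that will receive the notifications. The helper name `watch_user_updates` is hypothetical, and the channel fields shown are the minimal ones as I understand them, so check them against the API reference before relying on this.

```python
import uuid


def watch_user_updates(directory_service, domain, webhook_url):
    """Ask the Directory API to notify webhook_url when users are updated.

    directory_service is assumed to be an Admin SDK Directory client;
    webhook_url is a hypothetical HTTPS endpoint owned by the caller.
    """
    channel = {
        "id": str(uuid.uuid4()),  # unique ID chosen by the caller
        "type": "web_hook",
        "address": webhook_url,
    }
    return directory_service.users().watch(
        domain=domain, event="update", body=channel
    ).execute()
```

The handler behind `webhook_url` would then apply each reported change to the third-party database.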

Accessing database using only one user id and password

Can multiple users retrieve files from a database through a single database login?
Let me explain: suppose I have one database and only one login ID and password for it.
I want to serve multiple users across the globe from this database. Each user asks for different files at the same time, and all requests go through this one login ID and password, which the users themselves do not have.
I want to create a new layer between the database and the users to fetch these files.
Is this possible, or rather feasible, and what are the pros and cons?
Normally, database or data source access is encapsulated in a Data Access Object that resides in a Data Access Layer. In Java this can be done using the JDBC API or an Object-Relational Mapping framework such as Hibernate or iBatis. Since this question is tagged with C++, ODB (C++) is an option.
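The shape of such a layer can be sketched in a few lines. The example below is in Python with the standard-library `sqlite3` module purely for illustration (the same structure applies with ODB or JDBC). The `files` table, its `owner_id` column and the `FileDao` class are assumptions made for the example. The point is that only the DAO holds the single database login, while callers identify themselves by user ID.

```python
import sqlite3


class FileDao:
    """Data access object: the only component that holds the database login."""

    def __init__(self, conn):
        self._conn = conn  # opened once with the single shared credential

    def get_file(self, user_id, file_id):
        """Return the file's content if it belongs to user_id, else None."""
        row = self._conn.execute(
            "SELECT content FROM files WHERE id = ? AND owner_id = ?",
            (file_id, user_id),
        ).fetchone()
        return row[0] if row else None


conn = sqlite3.connect(":memory:")  # stand-in for the real database
conn.execute(
    "CREATE TABLE files (id INTEGER PRIMARY KEY, owner_id INTEGER, content TEXT)"
)
conn.execute("INSERT INTO files VALUES (1, 10, 'report')")
dao = FileDao(conn)
```

End users never see the database credentials; they only reach the data through whatever service exposes `get_file`, which is also where per-user checks and logging can be added.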