Django private file upload - django

I would like admin users to be able to attach arbitrary private files related to my models, à la SharePoint. As Django is primarily used as a publication platform, all of the examples I've found so far upload files to the static directory, where they are publicly accessible. How can I allow an admin to upload files that have the same auth permissions as the model they are related to?

What you can do is set a certain location in your web server that is classified as internal only, with business logic in your Django application that will send a redirect with a certain header that will allow your web server to serve the file statically.
I have previously done this with Nginx using X-Accel-Redirect so I am going to use that for my example, but I believe there is equivalent functionality in Apache2 and other web servers (X-Sendfile, I think?).
In your Nginx config, set up a location that serves the directory where you are uploading your access-protected files:
location /protected/ {
    internal;
    alias /var/www-priv/;
}
Files in this directory will not be accessible externally at the URL of /protected/{filepath}, but will be if you return a response from your Django application with the header X-Accel-Redirect = /protected/{filepath}.
Create a view with a URL like /media/{filepath} in which you perform the necessary business logic to control access to the file. You may want to make the path parameters slightly more detailed so that you can capture the app label, model and object ID to which the file is attached for the purposes of access control, e.g. /media/{app_label}/{model}/{object_id}/{filename}.
You then just do
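from django.http import HttpResponse

response = HttpResponse()
# Assumes filepath has no leading slash; Nginx maps /protected/ to the aliased directory
response['X-Accel-Redirect'] = "/protected/" + filepath
return response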
and Bob's your uncle - the user will be served the protected file.
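One detail worth handling in such a view is the path itself, since it comes straight from the URL. The following is a minimal, framework-free sketch of that step: the helper name `protected_redirect_path` is hypothetical, and the `/protected/` prefix is taken from the Nginx config shown earlier. It builds the value for the X-Accel-Redirect header and refuses paths that try to climb out of the protected directory.

```python
import posixpath


def protected_redirect_path(filepath, prefix="/protected/"):
    """Return the X-Accel-Redirect value for filepath, or None if it is unsafe.

    `filepath` is the part captured from the URL, e.g. "myapp/report/7/a.pdf".
    """
    # Collapse "a/./b" and "a/x/../b" so the check below sees the real target.
    normalized = posixpath.normpath(filepath.lstrip("/"))
    # Reject empty paths and anything that escapes the protected directory.
    if normalized in ("", ".", "..") or normalized.startswith("../"):
        return None
    return prefix + normalized
```

In the view, a `None` result would presumably become a 404, and otherwise the returned string is what goes into `response['X-Accel-Redirect']` once the permission check has passed.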

I think the most powerful way to do this is to write a custom file storage.
It's not so difficult (but may be overkill for your needs). Then you can bind your custom storage to your model in this way:
from django.db import models
from django.db.models.fields.files import FileField
from myapp.storage import MyCustomStorage

class MyModel(models.Model):
    path = FileField( ... , storage=MyCustomStorage(), blank=False)
Then you can implement the business logic in your custom storage class. This way, you can store your private files on a local file system, in a database, or in a remote system like Google App Engine.
Cheers!

Related

Best way of building middleware for intercepting requests in Django-Vuejs project

I'm working on a Django-Vuejs based project. In my project, a user can have a folder, and inside it they can create multiple files. Let's say the user restricts users from India from accessing that folder. This folder restriction then applies to its files as well. However, the user can override these settings at the file level, i.e. change the restriction settings for a single file. So if a folder with 5 files restricts India, and one file is set to allow India, then that one file will allow India and the other four will restrict it.
My question is how my middleware should be designed for these settings. Should I create interceptors in Vuejs and define middleware for each route (/folder/1 and /file/1), or create custom middleware in Django that inspects the requested path and checks the access settings accordingly?
What's the best way to achieve this?
I've tried rest_framework permissions at the folder level. This works perfectly if the settings exist only at the folder level. But if the settings are changed at the file level, it still checks the folder permission and sends the response accordingly (since the file is inside the folder).
I think you don't need middleware for that; all the logic should be in the view that serves the file.
For the permissions, if I were you, I would create a model called Permissions and store the user permissions in that model.
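The override rule from the question (a file-level setting wins over the folder-level one) can be sketched in plain Python as below. This is only an illustration under assumptions: the `Folder` and `File` classes, the `blocked_countries` attribute and the `is_allowed` function are hypothetical stand-ins for whatever models and permission check the project actually uses, and the view serving the file would call the check.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass
class Folder:
    blocked_countries: FrozenSet[str] = frozenset()


@dataclass
class File:
    folder: Folder
    # None means "inherit the folder's settings"; a set (even an empty one) overrides them.
    blocked_countries: Optional[FrozenSet[str]] = None


def is_allowed(file: File, country: str) -> bool:
    """Check a request's country against the file's effective restrictions."""
    if file.blocked_countries is not None:
        effective = file.blocked_countries
    else:
        effective = file.folder.blocked_countries
    return country not in effective


# A folder that blocks India, one inheriting file and one file that overrides it.
folder = Folder(blocked_countries=frozenset({"IN"}))
inherited = File(folder)
overridden = File(folder, blocked_countries=frozenset())
```

Keeping "no override" (`None`) distinct from "override with no restrictions" (an empty set) is what lets a single file allow India while its folder still blocks it.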

How to tell Django to put media (pdf) files urls in sitemap.xml?

I've put static views and model views in my Django-generated sitemap.xml file, but I do not know how to tell Django to put all of the media files into it. I have a hundred PDF files with SEO-friendly links and I want them in my sitemap.xml, but as they are not related to any of my models I don't know how to manage this.
EDIT: I almost forgot one important thing: my media (PDF) files are served through CloudFront, so even if I somehow manage to list them in my Django sitemap.xml I'll have an additional problem, because their URLs contain 'something.cloudfront.net' rather than my web site's URL 'example.com'.
Is this even possible to solve? How does this affect SEO?
SOLVED:
@kb, thanks for a great answer! I've used RewriteRule in my .htaccess as you suggested in the first part of your answer, and it works fine.
As for the second part, instead of creating a model for my media files (which would work just fine, the only downside being that every new PDF file has to be added manually), I decided to add some lines to my items() method so I could list the bucket contents and filter the PDF files. That way all of my files are always up to date, easily:
# sitemap.py
import boto
from boto.s3.key import Key
from boto.s3.connection import S3Connection
import re

def items(self):
    # Placeholders: fill in your own credentials and bucket name
    AWS_ACCESS_KEY_ID = 'my_access_key_number'
    AWS_SECRET_ACCESS_KEY = 'my_secret_access_key'
    Bucketname = 'my_bucket_name'
    conn = boto.s3.connect_to_region(
        'eu-central-1',
        aws_access_key_id=AWS_ACCESS_KEY_ID,
        aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
        is_secure=False,
        calling_format=boto.s3.connection.OrdinaryCallingFormat(),
    )
    bucket = conn.get_bucket(Bucketname)
    new_list = []
    # I hold my media files in bucketsubfolder, so a URL is for example
    # somedomain.cloudfront.net/bucketsubfolder/media/somefile.pdf
    regex = re.compile(r'bucketsubfolder/media.*\.pdf$', re.I)
    for item in bucket.list():
        if regex.match(item.name):
            new_list.append(item.name)
    return new_list
You are not allowed to use external URLs in your sitemap (or rather, they won't have the desired effect of being indexed by Google as part of your site content).
I think your best option is to dedicate a path on your site like /hosted/pdf/xxxx.pdf that rewrites everything to cloudfront.net/pdf/xxxx.pdf or similar using mod_rewrite/location patterns/regex.
That way you can use a local site URL in your sitemap but still have the browser sent directly to the CloudFront-served content. I think this might even be a good use of the 302 HTTP status code.
In the Sitemap class there is an items() method that returns what is to be included in the sitemap.xml, and you could create your own class that extends it and adds additional data.
You can hardcode the data manually in the method, but I think the preferred option is to create a Model that represents each remotely hosted file and that contains the information necessary to output it in the sitemap. (This also lets you add properties such as visibility on a per-file basis and lets you manage it via the admin, assuming you set up a ModelAdmin for it.)
I think you might be able to do something similar to what they show in http://docs.djangoproject.com/en/1.9/ref/contrib/sitemaps with the BlogSitemap class that extends Sitemap. Be sure to check the heading "Sitemap for static views" on that page as well.
My suggestion is that you choose the model approach to represent the files, so you have your hosted PDFs (or other CDN content) as a model called StaticHostedFile or similar and you iterate through all of them in the items() method. It does require you to index all the current PDFs to create model instances for them, as well as to create a new instance whenever a new PDF is added (but that could be automated).
It can be good to know that you can add "includes" in a sitemap.xml so you might be able to split the site content into two sitemaps (content+pdfs) and include both in sitemap.xml, for instance:
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://www.example.com/original_sitemap.xml</loc>
    <lastmod>2016-07-12T09:12Z</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://www.example.com/pdf_sitemap.xml</loc>
    <lastmod>2016-07-15T08:55Z</lastmod>
  </sitemap>
</sitemapindex>
This still requires local URLs and rewrites as described above, but it can be a nifty trick when you have several separate sitemaps to combine (for instance, when running a Django site under one subdirectory and a WordPress site under another).
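If the index is generated rather than written by hand, a minimal sketch using only the standard library could look like the following. The function name `build_sitemap_index` is hypothetical, and the entries are the example URLs and dates from the XML above.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(entries):
    """Build a <sitemapindex> document from (loc, lastmod) pairs."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        sitemap = ET.SubElement(root, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
        ET.SubElement(sitemap, "lastmod").text = lastmod
    # encoding="unicode" returns a str instead of bytes
    return ET.tostring(root, encoding="unicode")


index_xml = build_sitemap_index([
    ("http://www.example.com/original_sitemap.xml", "2016-07-12T09:12Z"),
    ("http://www.example.com/pdf_sitemap.xml", "2016-07-15T08:55Z"),
])
```

Django's sitemap framework also ships its own index view, so in a Django-only setup this hand-rolled version may not be needed.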

Serve media file from Django

I know that serving files and pictures directly from Django via views and URL dispatch is not a good approach, but if these files and pictures are served by the web server (Apache), the whole world can see them. What if some files and pictures are private to a user, and only that logged-in user should be able to see them? In this case, do I need to serve them through Django itself?
To serve private documents, you should use a Python view that does the security checks.
Here is an example.
If you are using Apache with mod_wsgi then you can use mod_xsendfile.
You are essentially looking to run the authorisation for some resources via Django and pass a header back to Apache saying 'Hey dude, lighten up. This user is okay to access this.' Apache will then handle returning the resource.
Rough steps (as in, rough enough that you will need to do a little more research using the links I provide as a starting point):
Apache needs to know which resources are public and which aren't. Create a subdirectory under media for each of these types (why not go crazy and call them /media/public/ and /media/private/).
Set up an alias for the public directory and a WSGIScriptAlias for the protected dir. The protected alias will point to your main site handler (probably django.wsgi).
Add settings to vhost:
XSendFile On
XSendFileAllowAbove On
Add a urlconf to your Django app that handles /media/protected/{whatever} and routes it through your Django app's auth logic. An example of this is here
A useful snippet for the above is here
and another example for good measure here

Restricting access to static files in Django/Nginx

I am building a system that allows users to generate documents and then download them. The documents are PDFs (not that it matters for the sake of this question), and when they are generated I store them on the local file system of the machine the web server runs on, with UUID file names
c7d43358-7532-4812-b828-b10b26694f0f.pdf
but I know "security through obscurity" is not the right solution ...
I want to restrict access to these files on a per-account basis if possible. One thing I think I could do is upload them to S3 and provide a signed URL, but I want to avoid that for now if possible.
I am using Nginx/Django/Gunicorn/EC2/S3
What are some other solutions?
If you are serving small files, you can indeed use Django to serve them directly, writing the file into the HttpResponse object.
If you're serving large files, however, you might want to leave that task to your web server. You can use the X-Accel-Redirect header on Nginx (and X-Sendfile for Apache and Lighttpd) to have your web server serve the file for you.
You can find more information about the header itself in Nginx's documentation here, and you could find some inspiration as to how to use that in Django here.
Once you're sending files through Django views, enforcing user authentication should be pretty straightforward using Django's auth framework.
How about enforcing user == owner at the view level: store the files as FileFields, prevent direct access to them, and only retrieve a file if that condition is met?
E.g. you could use the @login_required decorator on the view to allow access only to logged-in users. This could be refined using request.user to check against the owner of the file. The User Auth section of the Django documentation is likely to be helpful here.
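The decision such a view has to make can be sketched without any framework code. Everything named here is an assumption for illustration: the `Document` class, its `owner_id` field and the `download_status` function are hypothetical, and `user_id` being `None` stands for an anonymous request.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Document:
    owner_id: int
    filename: str


def download_status(user_id: Optional[int], document: Optional[Document]) -> int:
    """Return the HTTP status a download view would answer with."""
    if user_id is None:
        # Not logged in; login_required would redirect to the login page instead.
        return 401
    if document is None or document.owner_id != user_id:
        # Answer as if the file did not exist, so other accounts' files are not revealed.
        return 404
    return 200
```

Returning 404 rather than 403 for someone else's file is a design choice, not a requirement; it avoids confirming that a given UUID file name exists.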
The other option, as you mention, is via S3 itself: generating URLs within Django with a query string that allows an authenticated user to download a particular S3 object within a time limit. Details on that can be found in the S3 documentation. A similar question has been asked before here on SO.
I've used django-private-files with great success. It enforces protection at the view level and uses different backends to do the actual file transfer.

User permissions Django for serving media

I want to set up a Django server that allows certain users to access certain media. I'm sure this can't be that hard to do and I'm just being a little bit silly.
For example, I want USER1 to be able to access JPEG1, JPEG2 and JPEG3 but not JPEG4, and USER2 to be able to access JPEG3 and JPEG4.
[I know I should be burnt with fire for using Django to serve up media, but that's what I'm doing at the moment, I'll change it over when I start actually running on gas.]
You can send a file using Django by returning the file in the response, as shown in Vazquez-Abrams's link.
However, you would probably do best to use mod_xsendfile in Apache (or similar settings in lighttpd) for efficiency, since Django is not as fast at sending files. One way to do so while keeping the option of using the dev server's static serving would be http://pypi.python.org/pypi/django-xsendfile/1.0
As to which user should be able to access which JPEG, you will probably have to implement this yourself. A simple way would be to create an Image model with a many-to-many field to the users with access and a function to check if the current user is among those users. Something along the lines of:
if image.users_with_access.filter(pk=request.user.id).exists():
    return HttpResponse(image.get_file())
With lots of other code of course and only as an example. I actually use a modified mod_xsend in my own project for this very purpose.
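To make the access rule from the question concrete, here is a plain-Python stand-in for the many-to-many relation. The `ACCESS` mapping and the `can_view` function are hypothetical names; in the Django version the lookup would be the `users_with_access` query instead of a dictionary.

```python
# Stand-in for the many-to-many field: image name -> names of users with access.
ACCESS = {
    "JPEG1": {"USER1"},
    "JPEG2": {"USER1"},
    "JPEG3": {"USER1", "USER2"},
    "JPEG4": {"USER2"},
}


def can_view(user, image, access=ACCESS):
    """Return True only if the user is listed for the image; unknown images deny."""
    return user in access.get(image, set())
```

Denying by default for images that are missing from the mapping mirrors what the `.exists()` check does when no matching row is found.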
You just need to frob the response appropriately.
You can put the media in http://foo.com/media/blah.jpg and set up a media/(?P<file>.*) in urls.py to point to a view blahview that checks the user and their permissions within:
from you_shouldve_made_one_anyways import handler404

def blahview(request, *args, **kwargs):
    # 'file' is the named group captured by the media/(?P<file>.*) pattern
    if cannot_use(request.user, kwargs['file']):
        return handler404(request)
    ...
Though just to be clear, I do not recommend serving media through Django.