Django prevent image upload containing possible XSS code


I am creating a site where users can upload images. I am using django-storages to forward these images to an S3 bucket, but I recently read the security docs on Django's site: https://docs.djangoproject.com/en/3.0/topics/security/#user-uploaded-content
Django’s media upload handling poses some vulnerabilities when that media is served in ways that do not follow security best practices. Specifically, an HTML file can be uploaded as an image if that file contains a valid PNG header followed by malicious HTML. This file will pass verification of the library that Django uses for ImageField image processing (Pillow). When this file is subsequently displayed to a user, it may be displayed as HTML depending on the type and configuration of your web server.
It tells me about this vulnerability, but it does not give me an effective way of protecting against it, even though XSS is reportedly the third most common attack on websites.
Consider serving static files from a cloud service or CDN to avoid some of these issues.
I am using S3 to serve my media files. The docs say this avoids some of the vulnerabilities described in that section, but they do not say which.
My question: Is uploading images to and serving them from AWS S3 vulnerable to these attacks? If so, what is an effective way of sanitizing the content of the image?
Edit for bounty: I host the images on S3. What types of attacks or vulnerabilities are possible, and how can I prevent them?

Why not just verify that the file is a valid image?:
from PIL import Image
image = Image.open(file)
image.verify()
As another poster has suggested, you can indeed attempt a transformation and check if an exception is thrown, but verify() will probably be quicker.
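Or maybe you can try detecting the type (note that imghdr was deprecated in Python 3.11 and removed in 3.13, so this only works on older interpreters):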
import imghdr
path = 'Image.jpg'
imghdr.what(path)
Or
from PIL import Image
image = Image.open('myimage.png')
image.format
Using any of the above methods, you can determine if the file is actually an image or not. If it is not an image, then consider the file as spurious, and do not output it on any of your web pages. By not outputting the file, there is no risk of XSS from this vector, because even if the file is HTML, by not outputting it on your page, it cannot compromise your page.
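As a complement to the checks above, here is a minimal stdlib-only sketch of a structural check aimed at the specific case the Django docs describe: a valid PNG header followed by appended HTML. It is not what Pillow does internally, and the function name png_has_clean_structure is hypothetical. It walks the PNG chunks, verifies each CRC, and rejects any bytes after the IEND chunk. It does not inspect what is stored inside chunks, so treat it as one layer rather than a complete defence.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_has_clean_structure(data: bytes) -> bool:
    """Return True only if data is a PNG signature followed by well-formed
    chunks that end exactly at IEND, with no trailing bytes."""
    if not data.startswith(PNG_SIGNATURE):
        return False
    offset = len(PNG_SIGNATURE)
    seen_header = False
    while offset + 12 <= len(data):
        # Chunk layout: 4-byte big-endian length, 4-byte type, payload, 4-byte CRC.
        (length,) = struct.unpack(">I", data[offset:offset + 4])
        chunk_type = data[offset + 4:offset + 8]
        end = offset + 12 + length
        if end > len(data):
            return False  # declared length runs past the end of the file
        payload = data[offset + 8:offset + 8 + length]
        (crc,) = struct.unpack(">I", data[end - 4:end])
        # The CRC covers the chunk type and the payload, not the length field.
        if zlib.crc32(chunk_type + payload) & 0xFFFFFFFF != crc:
            return False
        if not seen_header:
            if chunk_type != b"IHDR":
                return False  # the first chunk must be IHDR
            seen_header = True
        if chunk_type == b"IEND":
            # Anything after IEND (for example appended HTML) is rejected.
            return end == len(data)
        offset = end
    return False  # ran out of data before reaching IEND
```

A caller would read the uploaded file's bytes and refuse the upload when the function returns False.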

The best possible solution comes from the Pillow library itself.
Even if someone manipulates the headers of an HTML file to make it look like a PNG file, performing an operation on it (say, resizing) will simply fail and throw an error, so you can catch that in a try/except block and warn or flag the user for malicious intent.
If you don't want to reduce the quality of the images you are given, you can resize to the original image size; that works without compromising image quality.

Related

What is safe implementation for sensitive data file file_url in Django

I am providing a file containing a sensitive username and password to the authenticated user. I want the user to download the file via file_url in a template, through the model.
File_link = models.FileField(upload_to='SAFE_DIRECTORY_PATH')
I don't feel it is safe to store it in the media directory.
Any suggestions for keeping these files safe? The web app will be generating the link.

Some security notes first.
This is probably a bad idea. Storing sensitive information in plain files is probably not the correct security approach, especially if you plan to use Django's media storage backend for doing that. It leaves all files out-in-the-open.
If however you really, really, and I mean really need to do that, you should encrypt the file first before saving in Django.
Again though, if at all possible I would recommend storing sensitive information in the database. In your case of storing passwords, you can use Django techniques to store that information relatively safely, such as correctly hashing passwords with a key-derivation function (e.g. PBKDF2 or bcrypt). If users need to download that information, you can always generate the file on the fly for them.
Some suggestions for uploading files.
I usually assign random filenames to the uploaded files. This way it is at least more challenging for users to guess the filenames in order to download them. This is not very secure, since it relies on security through obscurity, but it's better than nothing. If you need a Django field that does this automatically, you can make upload_to a callable (there are also third-party libraries for this, such as django-auxilium, although for full disclosure I'm the author of that library).
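A minimal sketch of such a callable, assuming a hypothetical "uploads/" prefix and the hypothetical name random_upload_path. Django calls an upload_to callable with the model instance and the original filename, and uses the returned string as the storage path.

```python
import os
import uuid


def random_upload_path(instance, filename):
    """Callable for FileField(upload_to=...): keep the extension, drop the name."""
    # Only the extension of the user-supplied name survives.
    extension = os.path.splitext(filename)[1].lower()
    # "uploads/" is an illustrative prefix, not something from the answer.
    return "uploads/{}{}".format(uuid.uuid4().hex, extension)
```

It would be wired up as models.FileField(upload_to=random_upload_path).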
Now that files are stored with random filenames, you probably never want to give users direct download links. Instead, authenticate them first and then use something like X-Accel in nginx or X-Sendfile in Apache to actually serve the file. The idea is that you first authenticate the user in Django. Then, instead of Django serving the file, you return a special header, which nginx/Apache catches, containing the path of the file that nginx/Apache should serve to the user. This way you don't waste resources in Django serving the file, yet you still get the advantage of being able to authenticate the request. There are a number of third-party apps for doing that as well.
Finally, to prevent users from downloading the media files directly, you can use nginx (and, I imagine, Apache) to restrict certain parts of the media folder:
location /media/protected {
    internal;
    alias /var/www/files;
}
In this case nginx will refuse direct user requests to /media/protected and will only allow to serve those files via X-Accel-Redirect header sent by Django. Then all you have to configure in Django is to store files in that path to make them protected:
models.FileField(upload_to='protected/myfiles')
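The Django side of this hand-off can be sketched as a small helper that builds the header nginx looks for. This is a framework-free sketch under assumptions: the name accel_redirect_headers is hypothetical, and relative_path is assumed to be the file's path relative to the directory that the nginx location above aliases. A view would check the user's permissions first, then copy the returned header onto an otherwise empty response.

```python
import posixpath

# Must match the internal nginx location shown above.
PROTECTED_URL_PREFIX = "/media/protected/"


def accel_redirect_headers(relative_path):
    """Build the response header that hands the download off to nginx."""
    # Normalizing against "/" collapses ".." segments so the resulting URI
    # cannot climb out of the protected prefix.
    cleaned = posixpath.normpath("/" + relative_path).lstrip("/")
    if not cleaned:
        raise ValueError("empty file path")
    return {"X-Accel-Redirect": PROTECTED_URL_PREFIX + cleaned}
```

Because the location is marked internal, a browser requesting the same URI directly is refused; only a response carrying this header triggers the file transfer.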

I was looking for a solution to serve files only to authorized users and came across this post. I think it is the top Google result for "django storing and providing secure files".
As the answer is rather old I wanted to share my finding:
django-private-storage (https://pypi.org/project/django-private-storage/) seems to be a good solution to this problem.

Sharing files from google cloud storage to GAE

I have a Django application running on GAE. The application uses a content folder which contains images and HTML snippets. The content folder was uploaded to Google Cloud Storage. I would like to render an image in a static file using an img tag, and for that I need to know the URL of the image. I have seen that when we set the permission to share publicly, it gives us a URL. But I don't want to share those files publicly: if I do, another application can use my files, and I don't want that. Is there any way to do this without logging in a user?

Sharing it publicly is the best way to go.
You could also base64 encode the image data when you render out the template, which means the url of the image will not be shown to the public on your page. Then you can obfuscate the image names in the GCS. This way it's still public but hard to reach.
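A minimal sketch of the base64 approach, assuming the image bytes have already been read from storage. The helper name image_data_uri is hypothetical. The returned string can be placed directly in the src attribute of an img tag when rendering the template.

```python
import base64


def image_data_uri(image_bytes, mime_type="image/png"):
    """Encode raw image bytes as a data: URI for inline use in an img tag."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return "data:{};base64,{}".format(mime_type, encoded)
```

Note that base64 inflates the payload by roughly a third and the image is no longer cached separately from the page, so this suits small images best.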

Restricting access to static files in Django/Nginx

I am building a system that allows users to generate documents and then download them. The documents are PDFs (not that it matters for the sake of this question), and when they are generated I store them on the local file system of the machine the web server is running on, with UUID file names
c7d43358-7532-4812-b828-b10b26694f0f.pdf
but I know "security through obscurity" is not the right solution ...
I want to restrict access to these files on a per-account basis if possible. One thing I think I could do is upload them to S3 and provide a signed URL, but I want to avoid that for now if possible.
I am using Nginx/Django/Gunicorn/EC2/S3
What are some other solutions?

If you are serving small files, you can indeed use Django to serve them directly, writing the file into the HttpResponse object.
If you're serving large files however, you might want to leave that task to your webserver, you can use the X-Accel-Redirect header on Nginx (and X-Sendfile for Apache & Lighttpd) to have your webserver serve the file for you.
You can find more information about the header itself in Nginx's documentation here, and you could find some inspiration as to how to use that in Django here.
Once you're sending files through Django views, enforcing user authentication should be pretty straightforward using Django's auth framework.

How about enforcing user==owner at the view level: prevent direct access to the files, store them as FileFields, and only retrieve the file if that condition is met?
For example, you could use the @login_required decorator on the view to allow access only if logged in. This could be refined by using request.user to check against the owner of the file. The User Auth section of the Django documentation is likely to be helpful here.
The other option, as you mention is via S3 itself, generating urls within Django which have a querystring allowing an authenticated user access to download a particular s3 object with a time limit. Details on that can be found at the s3 documentation. A similar question has been asked before here on SO.

I've used django-private-files with great success. It enforces protection at the view level and uses different backends to do the actual file transfer.

Need help setting up django-filetransfers

My setup is: Django 1.3/Python 2.7.2/Win Server 2008 R2/IIS 7.5/MS SQL Server 2008 R2. I am developing an application whose main function is to analyze uploaded files and produce a report.
Reading over the documentation for django-filetransfers, I believe this is a solution to a problem I've been trying to solve for a while (i.e. form-based file uploads completely block all Django responses until the file-transfer finishes...horror for even moderate-sized files).
The documentation talks about piping uploads to S3 or Blobstore, and that might be what I end up doing eventually, but during development I thought maybe I could just set up my own "poor-man's S3" on a server that I control. This would basically just be another Django instance (or possibly a simple ASP.NET app) whose sole purpose is to receive uploaded files. This sounds like it should be possible with django-filetransfers and would solve the problem of Django responsiveness (???).
But I am missing some bits of understanding how this works in general, as well as some specifics. Maybe an example will help: let's say I have MyMainDjangoServer and MyFileUploadServer. MyMainDjangoServer will serve the views, including the upload form. MyFileUploadServer will "catch" the uploaded files. My questions/confusion are as follows:
My upload form will contain additional fields beyond just the file(s)...do I understand correctly that MyMainDjangoServer will somehow still get that form data, minus the file data (basically: request.POST), and the file data gets shunted over to MyFileUploadServer? How does this work? Will MyMainDjangoServer still block during the upload to MyFileUploadServer?
I assume that what I would need to do on MyFileUploadServer is have a view/URL that handles the form request and sucks out the request.FILES data. What else needs to happen? What happens to the rest of the form data?
How would I set up my settings.py for this scenario? The django-filetransfers examples seem to assume either S3 or GAE/Blobstore but maybe I am missing some basics.
Any advice/answers appreciated...this is a confusing and frustrating area of Django for me.

"MyMainDjangoServer will somehow still get that form data, minus the file data (basically: request.POST), and the file data gets shunted over to MyFileUploadServer? How does this work? Will MyMainDjangoServer still block during the upload to MyFileUploadServer?"
I know the GAE Blobstore, presumably S3 as well, handles this by requiring you to give it a success_url. In your case that would be the url on MyMainDjangoServer where your file receiving view on MyFileUploadServer would re-post the non-files form data to once the upload is complete.
Have a look at the create_upload_url method here: https://developers.google.com/appengine/docs/python/blobstore/functions
You need to recreate this functionality in some form (see below).
"How would I set up my settings.py for this scenario?"
You'd need to create your own filetransfers backend which would be a file with a prepare_upload function in it.
You can see the App Engine one here:
https://github.com/django-nonrel/djangoappengine/blob/develop/storage.py
The prepare_upload method just wraps the GAE create_upload_url method mentioned above.
So in your settings.py you'd have something like:
PREPARE_UPLOAD_BACKEND = 'myapp.filetransfers_backend.prepare_upload'
(i.e. the import path to your prepare_upload function)
For the rest you can start with the ones provided by filetransfers already:
SERVE_FILE_BACKEND = 'filetransfers.backends.url.serve_file'
# if you need it:
PUBLIC_DOWNLOAD_URL_BACKEND = 'filetransfers.backends.url.public_download_url'
These rely on the file_field.url being set (see Django docs) and since your files will be on a separate server you probably need to look into writing a custom storage backend for Django too. (the S3 and GAE cases assume you're using the custom Django storage backends from here)

Django: control access to "static" files

Ok, I know that serving media files through Django is not recommended. However, I'm in a situation where I'd like to serve "static" files using fine-grained access control through Django models.
Example: I want to serve my movie library to myself over the web. I'm often travelling and I'd like to be able to view any of my movies wherever I am, provided I have internet access. So I rip my DVDs, upload them to my server and build this simple Django application coupled with some embeddable video player.
To avoid any legal repercussions, I'd like to ensure that the application grants access only to logged-on users with the proper permissions (i.e. myself and people living in the same household, who can, like me, access the real DVDs at their convenience), but denies it to other users (i.e. people who posted comments on my blog) and returns an HTTP 404.
Now, serving these files directly using Apache and mod_wsgi is rather troublesome because when an HTTP request for the media files (i.e. http://video.mywebsite.com/my-favorite-movie/) comes in, I need to validate against my user database that the person at the other end has the proper permissions.
Question: can I achieve this effect without serving the media files directly through a Django view? What are my options?
One thing I did think of is to write a simple script that takes a session ID and a video's slug and returns some boolean indicating if the user may (or may not) access the video file. Then, somehow request mod_wsgi to execute this script before accessing the requested URL and return an HTTP 404 if the script failed. However, I don't have a clue if this is even possible.
Edit: Posting this question clarified some of my ideas for search and I've come across mod_python's file wrapper extension. Does anyone have enough experience with that to validate that it is a viable solution?

Yes, you can hook into Django's authentication from Apache. See this how-to:
Authenticating against Django’s user database from Apache
