According to Django document: "it was common to place static assets in MEDIA_ROOT along with user-uploaded files, and serve them both at MEDIA_URL. "
Does that mean everyone could access other people's uploaded files?
Isn't it unsafe?
Yes
A clever user can possibly guess the path to media files belonging to other users.
Django was born in the news publishing business where this was not of concern: the admin is based in the concept of trusted users like writers and editors belonging to the same organization.
Mitigation
Requiring authentication
Not my first choice but you can make the webserver authenticate against Django's user database:
WSGIScriptAlias / /path/to/mysite.com/mysite/wsgi.py
WSGIPythonPath /path/to/mysite.com
WSGIProcessGroup %{GLOBAL}
WSGIApplicationGroup %{GLOBAL}
<Location "/media/private-user-content/">
AuthType Basic
AuthName "Top Secret"
Require valid-user
AuthBasicProvider wsgi
WSGIAuthUserScript /path/to/mysite.com/mysite/wsgi.py
</Location>
The accepted answer recommends serving sensitive files from an authenticated Django view. It is OK for low traffic apps but for larger projects it carries a performance hit not every site can afford.
Serving from Cloud Storage Services
Large projects should be using some cloud storage backend for both performance and cost considerations. If your project is already hosted at some of the big 3 (AWS, GCP, Azure) check Django Storages. For example, if you are using the S3 backend, you can turn "query parameter authentication" for generated URLs and voila, problem gone. This has some advantages:
it is transparent to developers
enterprise-grade performance
lower cost of storage and network
probably the most secure option
Obfuscating the path
For small projects where you are serving media and application from the same webserver you can make very hard for nosy users to find media files not belonging to them:
1) disable the web server "auto index" in the MEDIA_ROOT folder. For apache, it is like:
<Directory /path/to/application/media/root>
Options -Indexes
</Directory>
Without indexes, in order to access files belonging to other people you will have to guess the exact file name.
2) make the file path hard to guess using a crypto hash in the "upload_to" parameter from FileFields:
def hard_to_guess(instance, filename):
salt = 'six random words for hidden salt'
hash = hashlib.md5(instance.user.username + salt)
return '/'.join(['content', hash, filename])
...
class SomeModel(models.Model):
...
user = models.ForeignKey(User)
content = models.FileField(upload_to=hard_to_guess)
...
This solution has no performance hit because media files are still served directly from the webserver.
To answer your question: yes, this would allow everyone to access everybody's uploaded files. And yes, this is a security risk.
As a general rule, sensitive files should never be served directly from the filesystem. As another rule, all files should be considered sensitive unless explicitly marked otherwise.
The origin of the MEDIA_ROOT and MEDIA_URL settings probably lie in Django's history as a publishing platform. After all, your editors probably won't mind if the pictures they add to articles can easily be found. But then again, pictures accompanying an article are usually non-sensitive.
To expand on your question: sensitive files should always be placed in a directory that is not directly accessible by the web server. Requesting those files should only be done through a view class or function, which can do some sensible access checking before serving the file.
Also, do not rely on obfuscation for sensitive files. For example, let's use Paulo's example (see other answer) to obfuscate photo albums. Now my pictures are stored like MEDIA_URL/A8FEB0993BED/P100001.JPG. If I share this link with someone else, they can easily try URLs like MEDIA_URL/A8FEB0993BED/P710032.JPG, basically allowing them to brute-force my entire photo album.
Related
I know that it's not a good way to serve directly file and picture from django via views and urls dispatch, but if these files and pictures are served via the server (Apache), the whole world can see them. What if some files and pictures are private for the user, and only the connected user can see these files or pictures? In this case, I need to serve by django itself?
To serve private documents, you should use a Python view that does the security checks.
Here is an example.
If you are using Apache with mod_wsgi then you can use mod_xsendfile
You are essentially looking to run the authorisation for some resources via Django, pass a header back to Apache saying 'Hey dude, lighten up. This user is okay to access this' Apache will then handle returning the resource.
Rough steps (as in, rough enough that you will need to do a little more research using the links I provide as a starting point)
Apache needs to know which resources are public and which aren't. Create a sub directory under media for both of these types (Why not go crazxy and call them /media/public/ and /media/private/)
Set up an alias for the public directory and a WSGIScriptAlias for the protected dir, the protected alias will be pointing to your main site handler (probably django.wsgi)
Add settings to vhost:
XSendFile On
XSendFileAllowAbove On
Add an urlconf to your Django app that handles /media/protected/{whatever} and routes it through your auth Django app auth logic. An example of this is here
A useful snippet for the above is here
and another example for good measure here
Regarding this documentation page from the Django website,
https://docs.djangoproject.com/en/1.2/howto/static-files/
where it says for development "With that said, Django does support static files during development. You can use the django.views.static.serve() view to serve media files."
So my question is, if I use this method, How much work is required to move to apache.
Currently I have a symbolic link to my image folder in the /var/www folder, and in the Django settings I have set the media url to :
MEDIA_URL = 'http://127.0.0.1:80/Images/'
This seems like a fairly easy hack but my project is going to get very big (with lots of css, js and pdfs) and I doubt if this method is good.
My approach was to have apache itself intercept the static files urls and serve them directly without invoking django at all. So my apache config looked something like this:
<VirtualHost *:80>
ServerName www.myproject.com
Alias /static /docs/my_website/static
<Directory /docs/my_website/static>
Order allow,deny
Allow from all
</Directory>
Alias /favicon.ico /docs/my_website/static/images/icons/favicon.ico
Include "/13parsecs/conf/django.conf"
</VirtualHost>
Then you can just keep doing whatever you're doing in the dev environment, and when you get to apache it won't invoke django at all for static content, which is what you want.
This is a perfectly good way of doing things. The only change you'll need to make is to put the actual URL of your site, rather than the localhost IP.
Don't "move to Apache", start using it in the first place. None of the software needed has licensing fees and runs on almost any platform, so the only excuse you could have is "I'm too lazy".
Currently I am having three sites for example let it be site1, site2 and site3 . Each site require authentication. Both site1 and site2 take the same database let it be "Portfolio" database and site3 is having a different database let it be "site3specific" database.
I am planning to have a Common Account database for keeping the login credentials of users for the all different sites available. So that each sites (i.e. site1, site2 and site3) will make use of the Common Account database for authenticating the user login. I am planning to keep the user details in a separate database since all the three sites in development, testing and live environment can share the same user credentials without redundancy. Also each site may have its own specific data that we may be having or entering differently in development, staging and live environments.
Also there is a possibility of sharing some data between sites.
Could anyone please tell me how can I achieve these task in django + Apache + mod_wsgi.
Please advice whether I need to have a globally shared settings file , model file and urls file. IF then how my globally shared settings files need to be modified . Please advice.
This is how we currently operate.
Each site has its own VirtualHost entry in the httpd.conf, and each app has its own django.wsgi config file which looks something like this (you can probably use a simpler one):
import os, sys, site, glob
prev_sys_path = list(sys.path)
root_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), '..'))
site.addsitedir(glob.glob(os.path.join(root_dir, 'venv/lib/python*/site-packages'))[0])
sys.path.append('/usr/local/django-apps')
sys.path.append('/usr/local/django-apps/AppName')
new_sys_path = []
for item in list(sys.path):
if item not in prev_sys_path:
new_sys_path.append(item)
sys.path.remove(item)
sys.path[:0] = new_sys_path
os.environ['DJANGO_SETTINGS_MODULE'] = 'AppName.settings'
import django.core.handlers.wsgi
application = django.core.handlers.wsgi.WSGIHandler()
The VirtualHost needs to contain entries like this:
SetEnv DJANGO_ENV ${environment
WSGIDaemonProcess appname user=apache group=apache processes=2 threads=15 display-name=%{GROUP}
WSGIProcessGroup appname
WSGIScriptAlias / /usr/local/django-apps/AppName/apache/django.wsgi
<Directory /usr/local/django-apps/AppName/apache>
Order deny,allow
</Directory>
From there, the database set up is dependent on what database engine you're using.
Hope this helps.
You have to look at your requirements, and see if all sites would perhaps require and if so respect a single sign-on (sso) service. If that is the case, then you might need to look at how sessions are transfered between sites as sessions are SITE_ID specific. So, just making it work may be a great start, but looking at the big picture before you dig too deal in might be a good idea.
I set the same session name in these sites (a.xx.com/b.xx.com/c.xx.com -> sesssion name=xx.com). In my Django project, I used three settings files for each site and used the manager.py to separate these sites. The last step, start them up separately.
I'm serving "sensitive" information in downloadable PDF's and Spreadsheets within a user registration section of a site.
Is there a way to allow the django authentication to secure this media without serving it (and not have to manually login using basic auth)?
I'm guessing theres (fingers crossed) not a way to do it with the psuedo code below, but it helps better illustrate the end goal.
#urls.py
(r'^protected_media/(?P<filename>.*)$', 'protected_media')
#views.py
from django.contrib.auth.decorators import login_required
#login_required
def protected_media(request, filename):
# #login_required bounces you out to the login url
# if logged in, serve "filename" from Apache
It seems to me that the method you outlined in your code should work. It's really no different than any other protected resource: your views can serve files from disks, records from databases, rendered templates or anything. Just as the login_required decorator prevents unauthorized access to other views, it will prevent such access to your view serving protected media.
Am I missing something from your question here? Please clarify if that's the case.
EDIT: With regard to the django doc link in your comment: that's the method for simply serving any request file from a particular directory. So, in that example URLS like /site_media/foo.jpg, /site_media/somefolder/bar.jpg will automatically look for files foo.jpg and somefolder/bar.jpg under document_root. Basically, every thing under document_root will be publicly available. That's obviously insecure. So you avoid that with your method.
It's also considered inefficient because django is just adding a lot of unnecessary overhead when all you need is something like Apache to take a URL request and map it to a file on the hard drive. (You don't need django sessions, request processing, etc.)
In your case, this may not be such a big concern. First, you've secured the view. Second, it depends on your usage patterns. How many requests do you anticipate for these files? You're only using django for authentication -- does that justify other overhead? If not, you can look into serving those files with Apache and using an authentication provider. For more on this, see the mod_wsgi documentation:
http://code.google.com/p/modwsgi/wiki/AccessControlMechanisms
see the section "Apache Authentication Provider" and search for django
There are similar mechanisms available under mod_python I believe. (Update: just noticed the other answer. Please see Andre's answer for the mod_python method.)
EDIT 2: With regard to the code for serving a file, please see this snippet:
http://www.djangosnippets.org/snippets/365/
The send_file method uses a FileWrapper which is good for sending large static files back (it doesn't read the entire file into memory). You would need to change the content_type depending on the type of file you're sending (pdf, jpg, etc).
Read this Django ticket for more info. Start at the bottom to save yourself some time. Looks like it just missed getting into Django 1.2, and I assume also isn't in 1.3.
For Nginx, I found this Django snippet that takes advantage of the X-Accel-Redirect header, but haven't tried it yet.
If I understand your question correctly you want to restrict access to files that are not being served by Django, for example, with an Apache server?
What you would then require is some way for this Apache server to use Django as an authentication source.
This django snippet describes such a method. It creates an access handler in Django which is used by Apache when a request for a static file comes in that needs to be protected:
<Location "/protected/location">
PythonPath "['/path/to/proj/'] + sys.path"
PythonOption DJANGO_SETTINGS_MODULE myproj.settings
PythonOption DjangoPermissionName '<permission.codename>'
PythonAccessHandler my_proj.modpython #this should point to accesshandler
SetHandler None
</Location>
Hope this helps, the snippet was posted a while ago, so things might have changed between Django versions :)
More efficient serving of static files through Django is being looked at currently as part of Google SOC project. For WSGI this will use wsgi.file_wrapper extensions for WSGI if available, as it is for mod_wsgi, and req.sendfile() if using mod_python. It will also support returning of headers such as 'Location', 'X-Accel-Redirect' and others, which different web hosting mechanisms and proxy front ends accept as a means of serving up static files where location is defined by a backend web application, which isn't as effecient as front end for serving static files.
I am not sure if there is a project page for this in Django wiki somewhere or not, but the code changes are being committed into the branches/soc2009/http-wsgi-improvements branch of Django source code repository.
You needn't strictly wait for that stuff. It is just putting a clean and portable interface in place across the different mechanisms. If using nginx as front end in front of Apache/mod_wsgi, you could use X-Accel-Redirect now. If using Apache/mod_wsgi 3.0 and daemon mode, you could use Location now, but do need to ensure you set up Apache correct. Alternatively, you could implement your own WSGI middleware wrapper around the Django application which looks for some response header of your own to indicate file to be returned and which uses wsgi.file_wrapper to return that instead of actual response returned from Django.
BTW, the authentication hook mechanisms listed for both mod_python and mod_wsgi by others would use HTTP basic authentication, which isn't what you wanted. This is presuming you want files to be protected by Django form based login mechanism using cookies and backend sessions.
What options are there for installing Django such that multiple users (each with an "Account") can each have their own database?
The semantics are fairly intuitive. There may be more than one User for an Account. An Account has a unique database (and a database corresponds to an account). Picture WordpressMU. :)
I've considered this:
External solution - Multiplex to multiple servers/daemons
Multiple Django installations, with each Django installation / project corresponding to an account that sets its own DATABASE_NAME, e.g.
File system:
/bob
/settings.py (contains DATABASE_NAME="bob")
/sue
/settings.py (contains DATABASE_NAME="sue")
Then having a Django instance running for each of bob and sue. I don't like this methodology- it feels brutish and it smells foul. But I'm confident it would work, and based on the suggestions it might be the cleanest, smartest way to do it.
The apps can be stored elsewhere; the only thing that need be unique to the django configuration is the settings.py (and even there, only DATABASE_NAME, etc. need be different, the rest can be imported).
(Incidentally, I'm using lighttpd and FastCGI.)
Internal solution - Django multiplexing database settings
On the other hand, I've thought of having one single Django installation, and
(a) Adding a "prefix_" to each database table, corresponding to account of the logged-in user; or
(b) Changing the database according to the account of the User that is logged in.
I'd be particularly interested in seeing the "Django way" to do these (hoping that it's something dead-simple). For example, middleware that takes a Request's User and changes the django.conf.SETTINGS['DATABASE_NAME'] to the database for this user's account.
This raises red flags, viz. Is this thread-safe? i.e. Does changing django.conf.SETTINGS affect other processes? Is there just an inherent danger in changing django.conf.SETTINGS -- would the DB connection be setup already? Is restarting the DB connection part of the public API? -- I'm going to have a look at the Django source when I look to this problem again.
I'm conscious that 2(a) and (b) could require User authentication to be stored and accessed in a different mechanism that the core.
For now, I'm going to go with the external mapping at the webserver layer- it's simplest and cleanest for now. However, I don't like the idea of FastCGI daemons running for every account- it seems to needlessly waste memory, particularly if there will be 2000+ accounts. However, I'd like to keep this discussion open as it's an interesting problem and the solution doesn't seem ideal for certain cases.
Comments duly appreciated.
Cheers
The Django way would definitely be to have separate installations with their own database name (#1). #2 would involve quite a bit of hacking with the ORM, and even then I'm not quite sure it's possible at all.
But mind you, you don't need a WHOLE new installation of all the site's models/views/templates for each user, just a new settings.py with all the appropriate paths to the common source files. Plus, to run all these installations in Apache, do it the way I do here:
<VirtualHost 1.2.3.4>
DocumentRoot /www/site1
ServerName site1.com
<Location />
SetHandler python-program
SetEnv DJANGO_SETTINGS_MODULE site1.settings
PythonPath "['/www'] + sys.path"
PythonDebug On
PythonInterpreter site1
</Location>
</VirtualHost>
<VirtualHost 1.2.3.4>
DocumentRoot /www/site2
ServerName site2.com
<Location />
SetHandler python-program
SetEnv DJANGO_SETTINGS_MODULE site2.settings
PythonPath "['/www'] + sys.path"
PythonDebug On
PythonInterpreter site2
</Location>
</VirtualHost>
assuming you've got /www/site1/settings.py, www/site2/settings.py and so on...
Of course, you now need to have a main site where people log in, that then redirects you to the appropriate site (here I've just put it as "site1.com", "site2.com", but you get the idea.)
The Django ORM doesn't provide multiple database support classes, but it is definitely possible - you'll have to write a custom manager and make a few other tweaks. Eric Florenzano has a great article with detailed code samples:
http://www.eflorenzano.com/blog/post/easy-multi-database-support-django/