Django, creating cache key from request object

Django, creating cache key from request object - django

I have been assigned a new project. In this django project the previous developer uses django.core.cache module a lot.
I decided to keep it like this.
My question is this. Can I make a unique string out of the request object that would let me know if the request object is the same as before?
Request comes with a set of 15 parameters (even more) and it is hard to choose one or some of them to create a key. It has to be all, because different combinations lead to different results.
This is the code I want to change (some code left out for brevity):
#login_required
def compare(request, username):
cache_key = 'key__%d' % (request.GET.to_unique_id_or_similar())
cache_value = cache.get(cache_key)
if cache_value is not None:
return cache_value

Django provides a super easy way to do this with the #cache_page decorator.

Related

Django is there a benefit of checking every view with `method=='POST':`

Is there a benefit to starting every one of my view functions with if request.method=='POST': or if request.method=='GET':? Or would I just be adding unnecessary lines of code?
I've followed a few examples where views for Ajax are all checking if the HTTP is made with GET.
Could it, for example, prevent DDOS from latching on to a POST method and hammering it with GETs? Or, more practically, prevent API consumers from incorrectly PATCHing when they should PUT or POST?
def employee_delete(request, uid):
if request.method == 'DELETE':
def employee_detail(request, uid):
if request.method == 'GET':
def employee_create(request):
if request.method == 'POST':
def employee_update(request, uid):
if request.method == 'PUT':

Is there a benefit to starting every one of my view functions with if request.method=='POST':
Yes, even if you only support one method, it is better to guard this. The HTTP protocol specifies that GET requests, should not have side-effects (well effects in the sense of counting visitors are probably no problem, but not something that changes the "entities" of your business logic is strictly speaking not acceptable).
Now "web crawlers" (for example used by search engines or scrapers) typically detect links on a page, and make GET requests on these links (since they aim to "discover" new pages). If there is a view behind this URL that, for example, deletes an employee, it can happen that accidentally a "web crawler" will edit your database.
Other methods like GET, HEAD, PUT and DELETE should be idempotent (that means that making the same request twice, should have the same side-effects, as making the request only once).
So by not "protecting" your views, you lose a layer of "protection" against accidental misuse of your webserver.
A hacker could also aim to make a request with another method, and see how the server responds in a search to find exploits: for example, look if a server makes certain assumptions on the method that fail when performing a DELETE request. For example, a rather generic view implementation that handles all methods, could - if not guarded - unintentionally allow the deletion of files (for example you write a generic view to create and edit content can be "misused" by a hacker by using a DELETE request that the parent view implemented, but should not be supported for that specific entity).
In the early days, some HTTP webservers for example did not check authentication when a HEAD request was used. As a result, a hacker could, by trying several HEAD requests "scan" the id space, and thus obtain knowledge what id's were filled in in the database. Of course that in itself does not leak much data, but it is a vulnerability that can be used as a first step in hacking data.
Note that although Django has some protection against this when using, for example, class-based views, a person can just use any string for the request. So a person can write as method FOOBAR. If the view for example specifies if request.method == 'POST', and an else: statement, it can thus be used, to enter the else statement with a non-GET method.
But regardless of the use-case, "better be safe than sorry", and guarding the HTTP methods, is just one of the aspects to check.
That being said, if only a subset of methods are allowed, you can use the #require_http_methods [Django-doc] decorator:
from django.views.decorators.http import require_http_methods
#require_http_methods(["GET", "POST"])
def my_view(request):
# I can assume now that only GET or POST requests make it this far
# ...
pass
This decorator thus makes it more elegant to guard that the proper method is used.

To offer a different perspective, I think your question illustrates why you should consider using class-based views, which make life so much simpler when dealing with such problems.
For example the generic CreateView already comes with all the logic built in to restrict the type of HTTP request. It will let you perform a GET request to initialise a form, but require a POST request to process data. Thus you can't accidentally trigger data to be saved through a GET request.
It also provides the framework for proper form data validation, error handling etc - which you would have to implement yourself in a procedural view.
Same goes for the range of other views that Django provides - UpdateView, DetailView etc.
All Django class-based views come with a http_method_names attribute that you can use to control which methods are allowed on your views, e.g.,
from django.views.generic import View
class MyView(View):
# only GET and POST allowed. Anything else will get a 405 Method Not Allowed response.
http_method_names = ['get', 'post']
def get(self, request, *args, **kwargs):
# Logic for GET requests goes here.
def post(self, request, *args, **kwargs):
# Logic for POST requests goes here. No risk of it getting mixed up with GET.
This in addition to providing a lot of other helpers for things like form handling, template loading etc. Procedural views may feel simpler initially, but you will quickly realise that you end up having to write a lot more code to get them to do what you need.

How to safely access request object in Django models

What I am trying to do:
I am trying to access request object in my django models so that I can get the currently logged in user with request.user.
What I have tried:
I found a hack on this site. But someone in the comments pointed out not to do it when in production.
I also tried to override model's __init__ method just like mentioned in this post. But I got an AttributeError: 'RelatedManager' object has no attribute 'request'
Models.py:
class TestManager(models.Manager):
def user_test(self):
return self.filter(user=self.request.user, viewed=False)
class Test(models.Model):
def __init__(self, *args, **kwargs):
self.request = kwargs.pop('request', None)
super(Test, self).__init__(*args, **kwargs)
user = models.ForeignKey(User, related_name='test')
viewed = models.BooleanField(default=False)
objects = TestManager()

I trying to access request object in my Django models so that I can get the currently logged in user with request.user.
Well a problem is that models are not per se used in the context of a request. One for example frequently defines custom commands to do bookkeeping, or one can define an API where for example the user is not present. The idea of the Django approach is that models should not be request-aware. Models define the "business logic" layer: the models define entities and how they interact. By not respecting these layers, one makes the application vulnerable for a lot of problems.
The blog you refer to aims to create what they call a global state (which is a severe anti-patten): you save the request in the middleware when the view makes a call, such that you can then fetch that object in the model layer. There are some problems with this approach: first of all, like already said, not all use cases are views, and thus not all use cases pass through the middleware. It is thus possible that the attribute does not exist when fetching it.
Furthermore it is not guaranteed that the request object is indeed the request object of the view. It is for example possible that we use the model layer with a command that thus does not pass through the middleware, in which case we should use the previous view request (so potentially with a different user). If the server processes multiple requests concurrently, it is also possible that a view will see a request that arrived a few nanoseconds later, and thus again take the wrong user. It is also possible that the authentication middleware is conditional, and thus that not all requests have a user attribute. In short there are more than enough scenario's where this can fail, and the results can be severe: people seeing, editing, or deleting data that they do not "own" (have no permission to view, edit, or delete).
You thus will need to pass the request, or user object to the user_test method. For example with:
from django.http import HttpRequest
class TestManager(models.Manager):
def user_test(self, request_or_user):
if isinstance(request_or_user, HttpRequest):
return self.filter(user=request_or_user.user, viewed=False)
else:
return self.filter(user=request_or_user, viewed=False)
one thus has to pass the request object from the view to the function. Even this is not really pure. A real pure approach would only accept a user object:
class TestManager(models.Manager):
def user_test(self, user):
return self.filter(user=user, viewed=False)
So in a view one can use this as:
def some_view(request):
some_tests = Test.objects.user_test(request.user)
# ...
# return Http response
For example if we want to render a template with this queryset, we can pass it like:
def some_view(request):
some_tests = Test.objects.user_test(request.user)
# ...
return render(request, 'my_template.html', {'some_tests': some_tests})

Django practice: smaller views or less views?

I'm developing an app with django, and I have a view where I use 2 return render_to_response, with two different html files, depending on the status of the user.
I was wondering if it would be a better practice to split my view into two different views or if I should keep a bigger view.
What are the pros and the cons of doing so?
Sorry if my question is not clear. Thank you very much for your advices.

There's no right or wrong answer to this question so your question may not be acceptable on stackoverflow, which usually is intended for questions/problems with specific technical solutions.
That said, here's my view on this topic - I personally like to keep my view function small and if further processing is required, split them out into smaller functions.
For example:-
#permission_required('organizations.organization_add_user')
def organization_add_user(request, org_slug):
org = get_object_or_404(Organization, slug=org_slug)
form = OrganizationAddUserForm(org=org)
if request.method == 'POST':
form = OrganizationAddUserForm(request.POST or None, request.FILES or None, org=org)
if form.is_valid():
cd = form.cleaned_data
# Create the user object & send out email for activation
user = create_user_from_manual(request, data=cd)
# Add user to OrganizationUser
org_user, created = OrganizationUser.objects.get_or_create(user=user,\
organization=org)
dept = org.departments.get(name=cd['department'])
org_user.departments.add(dept)
# Add user to the appropriate roles (OrganizationGroup) and groups (django groups)
org_groups = OrganizationGroup.objects.filter(group__name__in=cd['roles'], \
organization=org)
for g in org_groups:
user.groups.add(g.group)
return HttpResponse(reverse('add_user_success'))
template = 'organizations/add_user.html'
template_vars = {'form': form}
# override request with dictionary template_vars
template_vars = FormMediaRequestContext(request=request, dict=template_vars)
return render(request, template, template_vars)
FormMediaEquestContext is a class I import from another file and has its own logic which helps me to handle javascript and css files associated with my form (OrganizationAddUserForm).
create_user_from_manual is yet another function which is encapsulated separately and deals with the reasonably convolutated logic relating to creating a new user in my system and sending an invitation email to that new user.
And of course, I serve up a different template if this is the first time a user arrives on this "add user" page as opposed to redirecting to a completely different url with its own view function and template when the add user form is successfully executed.
By keeping our view functions reasonably small, I have an easier time tracking down bugs relating to specific functionality.
In addition, it is also a good way to "reuse" my utility functions such as the create_user_from_manual method should I need this same utility function in another view function.
However, at the end of the day, organizing code and encapsulating code is a judgement call that you get to make as you progress as a developer.

Why Django 1.4 per-site cache does not work correctly with CACHE_MIDDLEWARE_ANONYMOUS_ONLY?

I am working on a Django 1.4 project and writing one simple application using per-site cache as described here:
https://docs.djangoproject.com/en/dev/topics/cache/#the-per-site-cache
I have correctly setup a local Memcached server and confirmed the pages are being cached.
Then I set CACHE_MIDDLEWARE_ANONYMOUS_ONLY = True because I don't want to cache pages for authenticated users.
I'm testing with a simple view that returns a template with render_to_response and RequestContext to be able to access user information from the template and the caching works well so far, meaning it caches pages just for anonymous users.
And here's my problem. I created another view using a different template that doesn't access user information and noticed that the page was being cached even if the user was authenticated. After testing many things I found that authenticated users were getting a cached page if the template didn't print something from the user context variable. It's very simple to test: print the user on the template and the page won't be cached for an authenticated user, remove the user on the template, refresh the page while authenticated and check the HTTP headers and you will notice you're getting a cached page. You should clear the cache between changes to see the problem more clearly.
I tested a little more and found that I could get rid of the user in the template and print request.user right on the view (which prints to the development server console) and that also fixed the problem of showing a cached page to an authenticated user but that's an ugly hack.
A similar problem was reported here but never got an answer:
https://groups.google.com/d/topic/django-users/FyWmz9csy5g/discussion
I can probably write a conditional decorator to check if user.is_authenticated() and based on that use #never_cache on my view but it seems like that defeats the purpose of using per-site cache, doesn't it?
"""
A decorator to bypass per-site cache if the user is authenticated. Based on django.views.decorators.cache.never_cache.
See: http://stackoverflow.com/questions/12060036/why-django-1-4-per-site-cache-does-not-work-correctly-with-cache-middleware-anon
"""
from django.utils.decorators import available_attrs
from django.utils.cache import add_never_cache_headers
from functools import wraps
def conditional_cache(view_func):
"""
Checks the user and if it's authenticated pass it through never_cache.
This version uses functools.wraps for the wrapper function.
"""
#wraps(view_func, assigned=available_attrs(view_func))
def _wrapped_view_func(request, *args, **kwargs):
response = view_func(request, *args, **kwargs)
if request.user.is_authenticated():
add_never_cache_headers(response)
return response
return _wrapped_view_func
Any suggestions to avoid the need of an extra decorator will be appreciated.
Thanks!

Ok, I just confirmed my "problem" was caused by Django lazy loading the User object.
To confirm it, I just added something like this to my view:
test_var = "some text" + request.user
And I got an error message telling me I couldn't concatenate an str to a SimpleLazyObject. At this point the lazy loading logic hasn't got a real User object yet.
To bypass the lazy loading, hence return a non-cache view for authenticated users, I just needed to access some method or attribute to triggers an actual query on the User object. I ended up with this, which I think it's the simplest way:
bypass_lazyload = request.user.is_authenticated()
My conditional_cache decorator is no longer needed, although it was an interesting exercise.
I may not need to do this when I finish working with my views as I'll access some user methods and attributes on my templates anyway but it's good to know what was going on.
Regards.

Django - Multiple Sites Site Caching

I have a number of sites under one Django application that I would like to implement site wide caching on. However it is proving to be a real hassle.
what happens is that settings.CACHE_MIDDLEWARE_KEY_PREFIX is set once on startup, and I cannot go ahead and change it depending on what the current site is. As a result if a page of url http://website1.com/abc/ is cached then http://website2.com/abc/ renders the cached version of http://website1.com/abc/. Both these websites are running on the same Django instance as this is what Django Sites appears to allow us to do.
Is this an incorrect approach? Because I cannot dynamically set CACHE_MIDDLEWARE_KEY_PREFIX during runtime I am unable to cache multiple sites using Django's Site wide caching. I also am unable to do this for template and view caching.
I get the impression that the way this really needs to be setup is that each site needs its own Django instance which is pretty much identical except for the settings file, which in my case will differ only by the value of CACHE_MIDDLEWARE_KEY_PREFIX. These Django instances all read and write to the same database. This concerns me as it could create a number of new issues.
Am I going down the right track or am I mistaken about how multi site architecture needs to work? I have checked the Django docs and there is not real mention of how to handle caching (that isn't low level caching) for Django applications that serve multiple sites.

(Disclaimer: the following is purely speculation and has not been tested. Consume with a pinch of salt.)
It might be possible to use the vary_on_headers view decorator to include the 'Host' header in the cache key. That should result in cache keys that include the HTTP Host header, thus effectively isolating the caches for your sites.
#vary_on_headers('Host')
def my_view(request):
# ....
Of course, that will only work on a per-view basis, and having to add a decorator to all views can be a big hassle.
Digging into the source of #vary_on_headers reveals the use of patch_vary_headers() which one might be able to use in a middleware to apply the same behaviour on a site level. Something along the lines of:
from django.utils.cache import patch_vary_headers
class VaryByHostMiddleware(object):
def process_response(self, request, response):
patch_vary_headers(response, ('Host',))
return response

I faced this problem recently. What I did based on the documentation was to create a custom method to add the site id to the key used to cache the view.
In settings.py add the KEY_FUNCTION argument:
CACHES = {
'default': {
'BACKEND': 'path.to.backend',
'LOCATION': 'path.to.location',
'TIMEOUT': 60,
'KEY_FUNCTION': 'path.to.custom.make_key_per_site',
'OPTIONS': {
'MAX_ENTRIES': 1000
}
}
}
And my custom make_key method:
def make_key_per_site(key, key_prefix, version):
site_id = ''
try:
site = get_current_site() # Whatever you use to get your site's data
site_id = site['id']
except:
pass
return ':'.join([key_prefix, site_id, str(version), key])

You need to change get_full_path to build_absolute_uri in django.util.cache
def _generate_cache_header_key(key_prefix, request):
"""Returns a cache key for the header cache."""
#path = md5_constructor(iri_to_uri(request.get_full_path()))
path = md5_constructor(iri_to_uri(request.build_absolute_uri())) # patch using full path
cache_key = 'views.decorators.cache.cache_header.%s.%s' % (
key_prefix, path.hexdigest())
return _i18n_cache_key_suffix(request, cache_key)
def _generate_cache_key(request, method, headerlist, key_prefix):
"""Returns a cache key from the headers given in the header list."""
ctx = md5_constructor()
for header in headerlist:
value = request.META.get(header, None)
if value is not None:
ctx.update(value)
#path = md5_constructor(iri_to_uri(request.get_full_path()))
path = md5_constructor(iri_to_uri(request.build_absolute_uri()))
cache_key = 'views.decorators.cache.cache_page.%s.%s.%s.%s' % (
key_prefix, request.method, path.hexdigest(), ctx.hexdigest())
return _i18n_cache_key_suffix(request, cache_key)
Or create you own slightly changed cache middleware for multisite.
http://macrotoma.blogspot.com/2012/06/custom-multisite-caching-on-django.html

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js