Technique for subclassing Django UpdateCacheMiddleware and FetchFromCacheMiddleware - django

I've used the UpdateCacheMiddleware and FetchFromCacheMiddleware middleware to enable site-wide anonymous caching, with varying levels of success.
The biggest problem is that the middleware only caches an anonymous user's first request. Since a session_id cookie is set on that first response, subsequent requests by that anonymous user miss the cache, because the cache key varies on request headers (including Cookie).
My web pages do not meaningfully vary among anonymous users and, insofar as they do vary, I can handle that via Ajax. As a result, I decided to subclass Django's caching middleware so that it no longer varies on headers. Instead, it varies on anonymous vs. logged-in users. Because I am using the auth backend, and that handler runs before fetching from the cache, it seems to work.
from django.conf import settings
from django.core.cache import get_cache
from django.middleware.cache import UpdateCacheMiddleware
from django.utils.cache import (
    _generate_cache_header_key, _generate_cache_key, get_max_age, patch_response_headers,
)
class AnonymousUpdateCacheMiddleware(UpdateCacheMiddleware):
    def process_response(self, request, response):
        """
        Sets the cache, if needed.
        We are overriding it in order to change the behavior of learn_cache_key().
        """
        if not self._should_update_cache(request, response):
            # We don't need to update the cache, just return.
            return response
        if not response.status_code == 200:
            return response
        timeout = get_max_age(response)
        if timeout is None:
            timeout = self.cache_timeout
        elif timeout == 0:
            # max-age was set to 0, don't bother caching.
            return response
        patch_response_headers(response, timeout)
        if timeout:
            # The key change: use the overridden learn_cache_key() defined below.
            cache_key = self.learn_cache_key(request, response, self.cache_timeout, self.key_prefix, cache=self.cache)
            if hasattr(response, 'render') and callable(response.render):
                response.add_post_render_callback(
                    lambda r: self.cache.set(cache_key, r, timeout)
                )
            else:
                self.cache.set(cache_key, response, timeout)
        return response
    def learn_cache_key(self, request, response, timeout, key_prefix, cache=None):
        """_generate_cache_header_key() creates a key for the given request path, adjusted for locales.
        With this key, a new cache key is set via _generate_cache_key() for the HttpResponse.
        The subsequent anonymous request to this path hits the FetchFromCacheMiddleware in the
        request capturing phase, which then looks up the headerlist value cached here on the initial response.
        FetchFromCacheMiddleware calculates a cache_key based on the values of the listed headers using
        _generate_cache_key and then looks for the response stored under that key. If the headers are the
        same as those set here, there will be a cache hit and the cached HttpResponse is returned.
        """
        key_prefix = key_prefix or settings.CACHE_MIDDLEWARE_KEY_PREFIX
        cache_timeout = self.cache_timeout or settings.CACHE_MIDDLEWARE_SECONDS
        cache = cache or get_cache(settings.CACHE_MIDDLEWARE_ALIAS)
        cache_key = _generate_cache_header_key(key_prefix, request)
        # Django normally varies caching by headers so that authed/anonymous users do not see the same pages.
        # This makes Google Analytics cookies break caching.
        # It also means that different anonymous session_ids break caching, so only the first anon request works.
        # In this subclass, we ignore headers and instead vary on authed vs. anonymous users.
        # Alternatively, we could strip cookies, potentially for the same outcome.
        # if response.has_header('Vary'):
        #     headerlist = ['HTTP_' + header.upper().replace('-', '_')
        #                   for header in cc_delim_re.split(response['Vary'])]
        # else:
        headerlist = []
        cache.set(cache_key, headerlist, cache_timeout)
        return _generate_cache_key(request, request.method, headerlist, key_prefix)
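To make the effect of the empty headerlist concrete, here is a simplified stand-in for a Vary-aware cache key. It is only a sketch: the function name cache_key, the key layout, and the sample cookie value are assumptions for illustration and are not Django's actual _generate_cache_key().

```python
import hashlib


def cache_key(path, headerlist, request_headers, key_prefix=""):
    """Hypothetical, simplified key: hash of the path plus the listed header values."""
    ctx = hashlib.md5()
    for name in headerlist:
        value = request_headers.get(name)
        if value is not None:
            ctx.update(value.encode("utf-8"))
    url = hashlib.md5(path.encode("utf-8")).hexdigest()
    return "views.cache.%s.%s.%s" % (key_prefix, url, ctx.hexdigest())


# First anonymous hit: no cookie yet. Later hit: a session cookie is sent.
first_request = {}
second_request = {"HTTP_COOKIE": "sessionid=abc123"}

# Varying on Cookie gives the two requests different keys (second one misses).
varying = (cache_key("/blog/", ["HTTP_COOKIE"], first_request),
           cache_key("/blog/", ["HTTP_COOKIE"], second_request))

# With an empty header list both requests map to the same key (second one hits).
not_varying = (cache_key("/blog/", [], first_request),
               cache_key("/blog/", [], second_request))
```

The trade-off is that every anonymous visitor now shares one cached response per path, so anything cookie-dependent on the page has to be filled in client-side.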
The fetcher, which is responsible for retrieving the page from the cache, looks like this:
from django.middleware.cache import FetchFromCacheMiddleware
class AnonymousFetchFromCacheMiddleware(FetchFromCacheMiddleware):
    def process_request(self, request):
        """
        Checks whether the page is already cached and returns the cached
        version if available.
        """
        if request.user.is_authenticated():
            request._cache_update_cache = False
            return None
        else:
            return super(AnonymousFetchFromCacheMiddleware, self).process_request(request)
There was a lot of copying for UpdateCacheMiddleware, obviously. I couldn't figure out a better hook to make this cleaner.
Does this generally seem like a good approach? Any obvious issues that come to mind?
Thanks,
Ben

You can work around this by temporarily removing the unwanted fields from response['Vary']:
from django.middleware.cache import UpdateCacheMiddleware
from django.utils.cache import cc_delim_re
class AnonymousUpdateCacheMiddleware(UpdateCacheMiddleware):
    def process_response(self, request, response):
        vary = None
        if not request.user.is_authenticated() and response.has_header('Vary'):
            vary = response['Vary']
            # Only hide Cookie here; add more headers as your usage requires.
            # Header names are case-insensitive, so compare in lower case.
            response['Vary'] = ', '.join(
                v for v in cc_delim_re.split(vary) if v.lower() != 'cookie')
        response = super(AnonymousUpdateCacheMiddleware, self).process_response(request, response)
        if vary is not None:
            # Restore the original header so downstream caches still see it.
            response['Vary'] = vary
        return response
Also, set CACHE_MIDDLEWARE_ANONYMOUS_ONLY = True in settings to prevent caching for authenticated users.
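The header-filtering step can be checked on its own, outside Django. The sketch below assumes a hypothetical helper named strip_vary and reuses the same comma-splitting pattern as cc_delim_re; it is not part of Django.

```python
import re

# Split on commas with optional surrounding whitespace.
VARY_DELIM = re.compile(r"\s*,\s*")


def strip_vary(vary, hidden=("cookie",)):
    """Return the Vary value with the hidden header names removed (case-insensitive)."""
    hidden_lower = {name.lower() for name in hidden}
    kept = [v for v in VARY_DELIM.split(vary.strip())
            if v and v.lower() not in hidden_lower]
    return ", ".join(kept)


print(strip_vary("Accept-Encoding, Cookie"))
```

Comparing in lower case matters because "Cookie" is usually written capitalized in the Vary header.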

Related

Is there a way to get a referring URL via a custom HTTP header?

I am currently using the following function to get a referring view:
import re
def get_referer_view(request, default=None):
    referer = request.META.get('HTTP_REFERER')
    if not referer:
        return default
    # remove the protocol and split the url at the slashes
    referer = re.sub(r'^https?://', '', referer).split('/')
    if referer[0] != request.META.get('SERVER_NAME'):
        return default
    # add the leading slash back to rebuild the relative path
    referer = u'/' + u'/'.join(referer[1:])
    return referer
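For comparison, the same host check can be written with the standard library's URL parser instead of a regex. This is only a sketch: referer_path is a hypothetical name, and unlike the function above it drops the query string and ignores any port in the Referer.

```python
from urllib.parse import urlsplit


def referer_path(referer, server_name, default=None):
    """Return the path part of a same-host Referer, or `default` otherwise."""
    if not referer:
        return default
    parts = urlsplit(referer)
    # Reject other schemes and other hosts; hostname excludes the port.
    if parts.scheme not in ("http", "https") or parts.hostname != server_name:
        return default
    return parts.path or "/"


print(referer_path("https://example.com/a/b?x=1", "example.com"))
```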
If I redirected the view as a result of programmatic logic, e.g...
return HttpResponseRedirect('dashboard')
...is there a way to get the referring view without using HTTP_REFERER so that I can use that variable in the redirected view? This is not always set in the headers of the browser.
Note: because the views are redirected programmatically, I can't use POST to collect the data.
Perhaps it's possible to set and retrieve a custom header somehow?
Use Django's middleware component.
https://docs.djangoproject.com/en/3.0/topics/http/middleware/
Something like this should work:
class HTTPReferer:
    def __init__(self, get_response):
        self.get_response = get_response
    def __call__old(self, request):
        # old
        referer = request.META.get('HTTP_REFERER', None)
        request.referer = referer
        # other business logic as you like it
        response = self.get_response(request)
        return response
    def __call__(self, request):
        # reflecting last edit
        path = request.path
        response = self.get_response(request)
        response['previous_path'] = path
        return response
So you could tie any information you need to every request/response cycle in Django (you could also set custom headers, etc...)
In the example above HTTP_REFERER will be available in request object as referer.
EDIT: I think your concern is that HTTP_REFERER is not always populated by the client, so you could attach HttpRequest.path to a custom header on every request. If the path is not enough, you could save the request args too. That's all, I think. Then you have a custom header populated with the last path. Further on, if this is not enough, you could use Django's URL resolver.
Since you control the page making the request, sure. Add the current URL to some header and extract it in your function, similar to this: Add request header before redirection
So instead of this:
def current_view(request):
    ...
    return HttpResponseRedirect('dashboard')
do something like this:
def current_view(request):
    ...
    response = redirect('/dashboard')
    response['source-view'] = request.resolver_match.view_name
    return response
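This should produce the 302 with the custom header source-view. Keep in mind that a browser following the redirect does not copy response headers into its next request, so the receiving view only sees the value if the client sends it back explicitly.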
For those interested, here's the solution I derived. The trick is to set a cookie after the first request to store the view_name or path and then call it and save to the request before rendering the view.
class MyMiddleware:
    def __init__(self, get_response):
        # One-time configuration and initialization.
        self.get_response = get_response
    def __call__(self, request):
        # Code to be executed for each request before
        # the view (and later middleware) are called.
        # Expose the referer cookies on the request object.
        request.referer_view = request.COOKIES.get('referer_view', None)
        request.referer_path = request.COOKIES.get('referer_path', None)
        print('request.referer_view', request.referer_view)
        print('request.referer_path', request.referer_path)
        response = self.get_response(request)
        # Code to be executed for each request/response after
        # the view is called.
        # Store the current view name and path so the next request can read them.
        # resolver_match is None when URL resolution failed (e.g. a 404).
        if request.resolver_match is not None:
            response.set_cookie('referer_view', request.resolver_match.view_name)
        response.set_cookie('referer_path', request.path)
        return response
The values then update in this cycle each time the view is changed.

Django cache_page with keys [duplicate]

The @cache_page decorator is awesome. But for my blog I would like to keep a page in cache until someone comments on a post. This sounds like a great idea: people rarely comment, so keeping the pages in memcached while nobody comments would be great. I'm thinking that someone must have had this problem before? And this is different from caching per URL.
So a solution I'm thinking of is:
@cache_page(60 * 15, key_prefix="blog")
def blog(request):
    ...
And then I'd keep a list of all cache keys used for the blog view and have a way to expire the "blog" cache space. But I'm not super experienced with Django, so I'm wondering if someone knows a better way of doing this?
This solution works for django versions before 1.7
Here's a solution I wrote to do just what you're talking about on some of my own projects:
def expire_view_cache(view_name, args=None, namespace=None, key_prefix=None):
    """
    This function allows you to invalidate any view-level cache.
    view_name: view function you wish to invalidate or its named url pattern
    args: any arguments passed to the view function
    namespace: optional, if an application namespace is needed
    key_prefix: for the @cache_page decorator for the function (if any)
    """
    from django.core.urlresolvers import reverse
    from django.http import HttpRequest
    from django.utils.cache import get_cache_key
    from django.core.cache import cache
    # create a fake request object
    request = HttpRequest()
    # Look up the request path:
    if namespace:
        view_name = namespace + ":" + view_name
    request.path = reverse(view_name, args=args)
    # get cache key, expire if the cached item exists:
    key = get_cache_key(request, key_prefix=key_prefix)
    if key:
        if cache.get(key):
            # Delete the cache entry.
            #
            # Note that there is a possible race condition here, as another
            # process / thread may have refreshed the cache between
            # the call to cache.get() above, and the cache.set(key, None)
            # below. This may lead to unexpected performance problems under
            # severe load.
            cache.set(key, None, 0)
        return True
    return False
Django keys these caches off the view's request, so this creates a fake request object for the cached view, uses it to compute the cache key, and then expires that key.
To use it in the way you're talking about, try something like:
from django.db.models.signals import post_save
from blog.models import Entry
def invalidate_blog_index(sender, **kwargs):
    expire_view_cache("blog")
post_save.connect(invalidate_blog_index, sender=Entry)
So basically, whenever a blog Entry object is saved, invalidate_blog_index is called and the cached view is expired. NB: I haven't tested this extensively, but it's worked fine for me so far.
The cache_page decorator will use CacheMiddleware in the end which will generate a cache key based on the request (look at django.utils.cache.get_cache_key) and the key_prefix ("blog" in your case). Note that "blog" is only a prefix, not the whole cache key.
You can get notified via django's post_save signal when a comment is saved, then you can try to build the cache key for the appropriate page(s) and finally say cache.delete(key).
However this requires the cache_key, which is constructed with the request for the previously cached view. This request object is not available when a comment is saved. You could construct the cache key without the proper request object, but this construction happens in a function marked as private (_generate_cache_header_key), so you are not supposed to use this function directly. However, you could build an object that has a path attribute that is the same as for the original cached view and Django wouldn't notice, but I don't recommend that.
The cache_page decorator abstracts caching quite a bit for you and makes it hard to delete a certain cache object directly. You could make up your own keys and handle them in the same way, but this requires some more programming and is not as abstract as the cache_page decorator.
You will also have to delete multiple cache objects when your comments are displayed in multiple views (i.e. index page with comment counts and individual blog entry pages).
To sum up: Django does time based expiration of cache keys for you, but custom deletion of cache keys at the right time is more tricky.
I wrote Django-groupcache for this kind of situation (you can download the code here). In your case, you could write:
from groupcache.decorators import cache_tagged_page
@cache_tagged_page("blog", 60 * 15)
def blog(request):
    ...
From there, you could simply do later on:
from groupcache.utils import uncache_from_tag
# Uncache all view responses tagged as "blog"
uncache_from_tag("blog")
Have a look at cache_page_against_model() as well: it's slightly more involved, but it will allow you to uncache responses automatically based on model entity changes.
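The tagging idea itself can be sketched without any library. The class below is a toy in-memory model with hypothetical names (TaggedCache, uncache_tag); it is not how django-groupcache is implemented, it only shows why a tag-to-keys index makes group invalidation a single call.

```python
from collections import defaultdict


class TaggedCache:
    """Toy in-memory cache that remembers which keys belong to which tag."""

    def __init__(self):
        self._data = {}
        self._tags = defaultdict(set)

    def set(self, key, value, tags=()):
        self._data[key] = value
        for tag in tags:
            self._tags[tag].add(key)

    def get(self, key, default=None):
        return self._data.get(key, default)

    def uncache_tag(self, tag):
        """Drop every entry stored under `tag`; return how many keys were tracked."""
        keys = self._tags.pop(tag, set())
        for key in keys:
            self._data.pop(key, None)
        return len(keys)


cache = TaggedCache()
cache.set("/blog/", "<index>", tags=["blog"])
cache.set("/blog/1/", "<post 1>", tags=["blog"])
cache.set("/about/", "<about>")
removed = cache.uncache_tag("blog")
```

With a shared backend such as memcached the tag index has to live in the cache as well, which is where most of the real complexity comes from.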
With the latest versions of Django (>=2.0), what you are looking for is very easy to implement:
from django.core.cache import cache
from django.shortcuts import render
from django.utils.cache import learn_cache_key
from django.views.decorators.cache import cache_page
keys = set()
@cache_page(60 * 15, key_prefix="blog")
def blog(request):
    response = render(request, 'template')
    # Use the same key_prefix as the decorator so the learned key matches.
    keys.add(learn_cache_key(request, response, key_prefix="blog"))
    return response
def invalidate_cache(**kwargs):
    # **kwargs lets this function be used directly as a signal receiver.
    cache.delete_many(keys)
You can register invalidate_cache as a callback for a pre_save signal so that it runs when someone updates a post in the blog.
This won't work on Django 1.7. As you can see at https://docs.djangoproject.com/en/dev/releases/1.7/#cache-keys-are-now-generated-from-the-request-s-absolute-url the new cache keys are generated from the full URL, so a path-only fake request won't work. You must set the request's host value properly:
fake_meta = {'HTTP_HOST':'myhost',}
request.META = fake_meta
If you have multiple domains working with the same views, you should cycle through them in HTTP_HOST, get the proper key, and do the cleanup for each one.
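A rough way to see why the host (and scheme) now matter: if the key is a hash of the absolute URL, any difference in scheme, host, or query string yields a different key. The function url_key below is a hypothetical stand-in, not Django's key function, and the host names are made up.

```python
import hashlib


def url_key(scheme, host, path, query=""):
    """Hash of the absolute URL, standing in for the URL part of a cache key."""
    absolute = "%s://%s%s" % (scheme, host, path)
    if query:
        absolute += "?" + query
    return hashlib.md5(absolute.encode("utf-8")).hexdigest()


real = url_key("http", "myhost", "/blog/")
fake_wrong_host = url_key("http", "localhost", "/blog/")
fake_wrong_scheme = url_key("https", "myhost", "/blog/")
fake_matching = url_key("http", "myhost", "/blog/")
```

Only the fake request whose scheme, host, and path all match the original one reproduces the key, which is why the later answers set SERVER_NAME, SERVER_PORT, HTTP_HOST, or the scheme explicitly.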
Django view cache invalidation for v1.7 and above. Tested on Django 1.9.
def invalidate_cache(path=''):
    ''' This function uses Django's caching function get_cache_key(). Since 1.7,
    Django has used more variables from the request object (scheme, host,
    path, and query string) in order to create the MD5 hashed part of the
    cache_key. Additionally, Django will use your server's timezone and
    language as properties as well. If internationalization is important to
    your application, you will most likely need to adapt this function to
    handle that appropriately.
    '''
    from django.core.cache import cache
    from django.http import HttpRequest
    from django.utils.cache import get_cache_key
    # Bootstrap request:
    # request.path should point to the view endpoint you want to invalidate
    # request.META must include the correct SERVER_NAME and SERVER_PORT as django uses these in order
    # to build a MD5 hashed value for the cache_key. Similarly, we need to artificially set the
    # language code on the request to 'en-us' to match the initial creation of the cache_key.
    # YMMV regarding the language code.
    request = HttpRequest()
    request.META = {'SERVER_NAME': 'localhost', 'SERVER_PORT': 8000}
    request.LANGUAGE_CODE = 'en-us'
    request.path = path
    try:
        cache_key = get_cache_key(request)
        if cache_key:
            if cache.has_key(cache_key):
                cache.delete(cache_key)
                return (True, 'successfully invalidated')
            else:
                return (False, 'cache_key does not exist in cache')
        else:
            raise ValueError('failed to create cache_key')
    except Exception as e:
        return (False, e)
Usage:
status, message = invalidate_cache(path='/api/v1/blog/')
I had the same problem and I didn't want to mess with HTTP_HOST, so I created my own cache_page decorator:
from django.core.cache import cache
def simple_cache_page(cache_timeout):
    """
    Decorator for views that tries getting the page from the cache and
    populates the cache if the page isn't in the cache yet.
    The cache is keyed by view name and arguments.
    """
    def _dec(func):
        def _new_func(*args, **kwargs):
            key = func.__name__
            if kwargs:
                # Use a separate loop variable so `key` is not shadowed.
                key += ':' + ':'.join([str(kwargs[k]) for k in kwargs])
            response = cache.get(key)
            if not response:
                response = func(*args, **kwargs)
                cache.set(key, response, cache_timeout)
            return response
        return _new_func
    return _dec
To expire the page cache, you just need to call:
cache.set('map_view:' + self.slug, None, 0)
where self.slug is the parameter from urls.py:
url(r'^map/(?P<slug>.+)$', simple_cache_page(60 * 60 * 24)(map_view), name='map'),
Django 1.11, Python 3.4.3
FWIW I had to modify mazelife's solution to get it working:
def expire_view_cache(view_name, args=None, namespace=None, key_prefix=None, method="GET"):
    """
    This function allows you to invalidate any view-level cache.
    view_name: view function you wish to invalidate or its named url pattern
    args: any arguments passed to the view function
    namespace: optional, if an application namespace is needed
    key_prefix: for the @cache_page decorator for the function (if any)
    from: http://stackoverflow.com/questions/2268417/expire-a-view-cache-in-django
    added: method to request to get the key generating properly
    """
    from django.core.urlresolvers import reverse
    from django.http import HttpRequest
    from django.utils.cache import get_cache_key
    from django.core.cache import cache
    # create a fake request object
    request = HttpRequest()
    request.method = method
    # Look up the request path:
    if namespace:
        view_name = namespace + ":" + view_name
    request.path = reverse(view_name, args=args)
    # get cache key, expire if the cached item exists:
    key = get_cache_key(request, key_prefix=key_prefix)
    if key:
        if cache.get(key):
            cache.set(key, None, 0)
        return True
    return False
Instead of using the cache_page decorator, you could manually cache the blog post object (or similar) if there are no comments, and then, when there's a first comment, re-cache the blog post object so that it's up to date (assuming the object has attributes that reference any comments). After that, just let the cached data for the commented blog post expire and don't bother re-caching it.
Instead of explicit cache expiration, you could use a new "key_prefix" every time somebody comments on the post. For example, it might be the datetime of the post's last comment (you could even combine this value with the Last-Modified header).
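The versioned-prefix idea can be illustrated with a plain dict standing in for the cache. The helper versioned_prefix, the key layout, and the timestamps are assumptions made up for this sketch.

```python
from datetime import datetime


def versioned_prefix(blog_id, last_comment_at):
    """Build a key prefix that changes whenever the last-comment time changes."""
    stamp = last_comment_at.strftime("%Y%m%d%H%M%S") if last_comment_at else "none"
    return "blog:%s:%s" % (blog_id, stamp)


cache = {}

# Page cached while the newest comment is from 1 Jan.
before = versioned_prefix(7, datetime(2020, 1, 1, 12, 0, 0))
cache[before + ":/blog/7/"] = "<html>old page</html>"

# A new comment arrives: the prefix changes, so lookups use a fresh key.
after = versioned_prefix(7, datetime(2020, 1, 2, 9, 30, 0))
hit = cache.get(after + ":/blog/7/")
```

Nothing is deleted explicitly: the old entry simply becomes unreachable and ages out with its normal timeout.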
Unfortunately, Django (including cache_page()) does not support dynamic "key_prefix"es (checked on Django 1.9), but a workaround exists. You can implement your own cache_page() that uses an extended CacheMiddleware with dynamic "key_prefix" support. For example:
from django.middleware.cache import CacheMiddleware
from django.utils.decorators import decorator_from_middleware_with_args
def extended_cache_page(cache_timeout, key_prefix=None, cache=None):
    return decorator_from_middleware_with_args(ExtendedCacheMiddleware)(
        cache_timeout=cache_timeout,
        cache_alias=cache,
        key_prefix=key_prefix,
    )
class ExtendedCacheMiddleware(CacheMiddleware):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        if callable(self.key_prefix):
            self.key_function = self.key_prefix
    def key_function(self, request, *args, **kwargs):
        return self.key_prefix
    def get_key_prefix(self, request):
        return self.key_function(
            request,
            *request.resolver_match.args,
            **request.resolver_match.kwargs
        )
    def process_request(self, request):
        self.key_prefix = self.get_key_prefix(request)
        return super().process_request(request)
    def process_response(self, request, response):
        self.key_prefix = self.get_key_prefix(request)
        return super().process_response(request, response)
Then in your code:
from django.utils.lru_cache import lru_cache
@lru_cache()
def last_modified(request, blog_id):
    """return fresh key_prefix"""
@extended_cache_page(60 * 15, key_prefix=last_modified)
def view_blog(request, blog_id):
    """view blog page with comments"""
Most of the solutions above didn't work in our case because we use https. The source code for get_cache_key reveals that it uses request.build_absolute_uri() to generate the cache key.
The default HttpRequest class reports the scheme as http. Thus we need to override it to use https for our dummy request object.
This is the code that works fine for us :)
from django.core.cache import cache
from django.http import HttpRequest
from django.urls import reverse
from django.utils.cache import get_cache_key
class HttpsRequest(HttpRequest):
    @property
    def scheme(self):
        return "https"
def invalidate_cache_page(
    path,
    query_params=None,
    method="GET",
):
    request = HttpsRequest()
    # meta information can be checked from error logs
    request.META = {
        "SERVER_NAME": "www.yourwebsite.com",
        "SERVER_PORT": "443",
        "QUERY_STRING": query_params,
    }
    request.path = path
    key = get_cache_key(request, method=method)
    # get_cache_key() returns None when nothing was cached for this request.
    if key is not None and cache.has_key(key):
        cache.delete(key)
Now I can use this utility function to invalidate the cache from any of our views:
path = reverse('url_name', kwargs={'id': obj.id})
invalidate_cache_page(path)
Duncan's answer works well with Django 1.9. But if we need to invalidate a URL with GET parameters, we have to make a few changes to the request.
E.g. for .../?mykey=myvalue:
request.META = {'SERVER_NAME':'127.0.0.1','SERVER_PORT':8000, 'REQUEST_METHOD':'GET', 'QUERY_STRING': 'mykey=myvalue'}
request.GET.__setitem__(key='mykey', value='myvalue')
I struggled with a similar situation and here is the solution I came up with, I started it on an earlier version of Django but it is currently in use on version 2.0.3.
First issue: when you set things to be cached in Django, it sets headers so that downstream caches -- including the browser cache -- cache your page.
To override that, you need to set middleware. I cribbed this from elsewhere on StackOverflow, but can't find it at the moment. In appname/middleware.py:
from django.utils.cache import add_never_cache_headers
class Disable(object):
    def __init__(self, get_response):
        self.get_response = get_response
    def __call__(self, request):
        response = self.get_response(request)
        add_never_cache_headers(response)
        return response
Then in settings.py, to MIDDLEWARE, add:
'appname.middleware.Disable',
Keep in mind that this approach completely disables downstream caching, which may not be what you want.
Finally, I added to my views.py:
from django.core.cache import cache
from django.utils.cache import get_cache_key
def expire_page(request, path=None, query_string=None, method='GET'):
    """
    :param request: "real" request, or at least one providing the same scheme, host, and port as what you want to expire
    :param path: The path you want to expire, if not the path on the request
    :param query_string: The query string you want to expire, as opposed to the path on the request
    :param method: the HTTP method for the page, if not GET
    :return: None
    """
    if query_string is not None:
        request.META['QUERY_STRING'] = query_string
    if path is not None:
        request.path = path
    request.method = method
    # get_raw_uri and method show, as of this writing, everything used in the cache key
    # print('req uri: {} method: {}'.format(request.get_raw_uri(), request.method))
    key = get_cache_key(request)
    if key in cache:
        cache.delete(key)
I didn't like having to pass in a request object, but as of this writing it provides the scheme/protocol, host, and port for the request. Pretty much any request object for your site/app will do, as long as you pass in the path and query string.
One more updated version of Duncan's answer. I had to figure out the correct meta fields (tested on Django 1.9.8):
def invalidate_cache(path=''):
    import socket
    from django.core.cache import cache
    from django.http import HttpRequest
    from django.utils.cache import get_cache_key
    request = HttpRequest()
    domain = 'www.yourdomain.com'
    request.META = {'SERVER_NAME': socket.gethostname(), 'SERVER_PORT': 8000, "HTTP_HOST": domain, 'HTTP_ACCEPT_ENCODING': 'gzip, deflate, br'}
    request.LANGUAGE_CODE = 'en-us'
    request.path = path
    try:
        cache_key = get_cache_key(request)
        if cache_key:
            if cache.has_key(cache_key):
                cache.delete(cache_key)
                return (True, 'successfully invalidated')
            else:
                return (False, 'cache_key does not exist in cache')
        else:
            raise ValueError('failed to create cache_key')
    except Exception as e:
        return (False, e)
The solution is simple and does not require any additional work.
Example:
@cache_page(60 * 10)
def our_team(request, sorting=None):
    ...
This stores the response in the cache under the default key.
Expire a view cache:
from django.utils.cache import get_cache_key
from django.core.cache import cache
def our_team(request, sorting=None):
    # Overwrite the cached value with None. get_cache_key() returns None
    # when nothing has been cached for this request yet.
    key = get_cache_key(request)
    if key is not None:
        cache.set(key, None)
Simple, clean, fast.

Huge Django Session table, normal behaviour or bug?

Perhaps this is completely normal behaviour, but I feel like the django_session table is much larger than it should have to be.
First of all, I run the following cleanup command daily so the size is not caused by expired sessions:
DELETE FROM %s WHERE expire_date < NOW()
The numbers:
We've got about 5000 unique visitors (bots excluded) every day.
The SESSION_COOKIE_AGE is set to the default, 2 weeks
The table has a little over 1,000,000 rows
So, I'm guessing that Django also generates session keys for all bots that visits the site and that the bots don't store the cookies so it continuously generates new cookies.
But... is this normal behaviour? Is there a setting so Django won't generate sessions for anonymous users, or at least no sessions for users that aren't using sessions?
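A quick back-of-the-envelope check of the numbers above supports the suspicion. It assumes a steady state in which every row lives for the full cookie age before the daily cleanup removes it; that assumption is mine, not something measured.

```python
rows = 1_000_000          # rows currently in django_session
cookie_age_days = 14      # SESSION_COOKIE_AGE default, in days
visitors_per_day = 5000   # unique human visitors per day

# In steady state, rows created per day is roughly total rows / lifetime in days.
new_rows_per_day = rows / cookie_age_days
rows_per_visitor = new_rows_per_day / visitors_per_day

print(round(new_rows_per_day), round(rows_per_visitor, 1))
```

Under that assumption, roughly fourteen session rows are created per human visitor per day, which points at something other than real visitors (or real session use) creating rows.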
After a bit of debugging I've managed to trace cause of the problem.
One of my middlewares (and most of my views) have a request.user.is_authenticated() in them.
The django.contrib.auth middleware sets request.user to LazyUser()
Source: http://code.djangoproject.com/browser/django/trunk/django/contrib/auth/middleware.py?rev=14919#L13 (I don't see why there is a return None there, but ok...)
class AuthenticationMiddleware(object):
    def process_request(self, request):
        assert hasattr(request, 'session'), "The Django authentication middleware requires session middleware to be installed. Edit your MIDDLEWARE_CLASSES setting to insert 'django.contrib.sessions.middleware.SessionMiddleware'."
        request.__class__.user = LazyUser()
        return None
The LazyUser calls get_user(request) to get the user:
Source: http://code.djangoproject.com/browser/django/trunk/django/contrib/auth/middleware.py?rev=14919#L5
class LazyUser(object):
    def __get__(self, request, obj_type=None):
        if not hasattr(request, '_cached_user'):
            from django.contrib.auth import get_user
            request._cached_user = get_user(request)
        return request._cached_user
The get_user(request) method does a user_id = request.session[SESSION_KEY]
Source: http://code.djangoproject.com/browser/django/trunk/django/contrib/auth/__init__.py?rev=14919#L100
def get_user(request):
    from django.contrib.auth.models import AnonymousUser
    try:
        user_id = request.session[SESSION_KEY]
        backend_path = request.session[BACKEND_SESSION_KEY]
        backend = load_backend(backend_path)
        user = backend.get_user(user_id) or AnonymousUser()
    except KeyError:
        user = AnonymousUser()
    return user
Accessing the session sets accessed to True:
Source: http://code.djangoproject.com/browser/django/trunk/django/contrib/sessions/backends/base.py?rev=14919#L183
def _get_session(self, no_load=False):
    """
    Lazily loads session from storage (unless "no_load" is True, when only
    an empty dict is stored) and stores it in the current instance.
    """
    self.accessed = True
    try:
        return self._session_cache
    except AttributeError:
        if self._session_key is None or no_load:
            self._session_cache = {}
        else:
            self._session_cache = self.load()
    return self._session_cache
And that causes the session to initialize. The bug was caused by a faulty session backend that also generates a session when accessed is set to true...
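The difference between "accessed" and "modified" can be modelled in a few lines. This is a toy model with hypothetical names (LazySession, handle, a dict standing in for the django_session table), not Django's session classes; it only shows that a backend which persists on modification creates no row for a request that merely reads the session.

```python
import uuid


class LazySession:
    """Toy session that tracks reads and writes separately."""

    def __init__(self, store):
        self._store = store      # dict standing in for the session table
        self._data = {}
        self.accessed = False
        self.modified = False

    def __getitem__(self, key):
        self.accessed = True
        return self._data[key]

    def __setitem__(self, key, value):
        self.accessed = True
        self.modified = True
        self._data[key] = value

    def save_if_needed(self):
        # Persist only when something was written, never on a mere read.
        if self.modified:
            self._store[uuid.uuid4().hex] = dict(self._data)


def handle(store, writes=None):
    """Simulate one request: look up the user id, apply optional writes, save."""
    session = LazySession(store)
    try:
        session["_auth_user_id"]     # what an is_authenticated() check ends up doing
    except KeyError:
        pass
    for key, value in (writes or {}).items():
        session[key] = value
    session.save_if_needed()
    return session


table = {}
anonymous = handle(table)                    # read only: no row created
rows_after_read = len(table)
handle(table, writes={"zip": "55812"})       # a write: one row created
rows_after_write = len(table)
```

A backend that saved whenever accessed is True would create a row for the first request as well, which matches the behaviour described above.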
Is it possible for robots to access any page where you set anything in a user session (even for anonymous users), or any page where you use session.set_test_cookie() (for example, Django's default login view calls this method)? In both of these cases a new session object is created. Excluding such URLs in robots.txt should help.
For my case, I wrongly set SESSION_SAVE_EVERY_REQUEST = True in settings.py without understanding the exact meaning.
Then every request to my Django service generated a session entry, including the heartbeat test requests from upstream load balancers. After several days of running, the django_session table had grown huge.
Django offers a management command to clean up expired sessions: python manage.py clearsessions (named cleanup before Django 1.5).

Django: HTTPS for just login page?

I just added this SSL middleware to my site http://www.djangosnippets.org/snippets/85/ which I used to secure only my login page so that passwords aren't sent in clear-text. Of course, when the user navigates away from that page he's suddenly logged out. I understand why this happens, but is there a way to pass the cookie over to HTTP so that users can stay logged in?
If not, is there an easy way I can use HTTPS for the login page (and maybe the registration page), and then have it stay on HTTPS if the user is logged in, but switch back to HTTP if the user doesn't log in?
There are a lot of pages that are visible to both logged in users and not, so I can't just designate certain pages as HTTP or HTTPS.
Actually, modifying the middleware like so seems to work pretty well:
from django.conf import settings
from django.http import HttpResponsePermanentRedirect
class SSLRedirect:
    def process_view(self, request, view_func, view_args, view_kwargs):
        if 'SSL' in view_kwargs:
            secure = view_kwargs['SSL']
            del view_kwargs['SSL']
        else:
            secure = False
        # Keep logged-in users on HTTPS for every page.
        if request.user.is_authenticated():
            secure = True
        if not secure == self._is_secure(request):
            return self._redirect(request, secure)
    def _is_secure(self, request):
        if request.is_secure():
            return True
        # Handle the Webfaction case until this gets resolved in request.is_secure()
        if 'HTTP_X_FORWARDED_SSL' in request.META:
            return request.META['HTTP_X_FORWARDED_SSL'] == 'on'
        return False
    def _redirect(self, request, secure):
        protocol = secure and "https://secure" or "http://www"
        newurl = "%s.%s%s" % (protocol, settings.DOMAIN, request.get_full_path())
        if settings.DEBUG and request.method == 'POST':
            raise RuntimeError(
                "Django can't perform a SSL redirect while maintaining POST data. "
                "Please structure your views so that redirects only occur during GETs.")
        return HttpResponsePermanentRedirect(newurl)
It is better to secure everything. Half secure seems secure, but it totally is not. To put it bluntly: by doing so you are deceiving your end users by giving them a false sense of security.
So either don't use SSL or, better, use it all the way. The overhead for both server and end user is negligible.

Django Cookies, how can I set them?

I have a web site which shows different content based on a location the visitor chooses. E.g., a user enters 55812 as the zip code. I know what city and area (lat/long) that is and give them the content pertinent to that area. My question is: how can I store this in a cookie so that when they return they are not required to enter their zip code again?
I see it as follows:
Set a persistent cookie based on their area.
When they return, read the cookie and grab the zip code.
Return content based on the zip code in their cookie.
I can't seem to find any solid information on setting a cookie. Any help is greatly appreciated.
Using Django's session framework should cover most scenarios, but Django also now provides direct cookie manipulation methods on the request and response objects (so you don't need a helper function).
Setting a cookie:
def view(request):
    response = HttpResponse('blah')
    response.set_cookie('cookie_name', 'cookie_value')
    return response
Retrieving a cookie:
def view(request):
    value = request.COOKIES.get('cookie_name')
    if value is None:
        # Cookie is not set
        ...
    # OR
    try:
        value = request.COOKIES['cookie_name']
    except KeyError:
        # Cookie is not set
        ...
UPDATE : check Peter's answer below for a builtin solution :
This is a helper to set a persistent cookie:
import datetime
from django.conf import settings
def set_cookie(response, key, value, days_expire=7):
    if days_expire is None:
        max_age = 365 * 24 * 60 * 60  # one year
    else:
        max_age = days_expire * 24 * 60 * 60
    expires = datetime.datetime.strftime(
        datetime.datetime.utcnow() + datetime.timedelta(seconds=max_age),
        "%a, %d-%b-%Y %H:%M:%S GMT",
    )
    response.set_cookie(
        key,
        value,
        max_age=max_age,
        expires=expires,
        domain=settings.SESSION_COOKIE_DOMAIN,
        secure=settings.SESSION_COOKIE_SECURE or None,
    )
Use the following code before sending a response:
def view(request):
    response = HttpResponse("hello")
    set_cookie(response, 'name', 'jujule')
    return response
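To see what such a persistent cookie looks like at the HTTP level, the standard library's http.cookies module can build and parse the header values without Django. The helper names build_set_cookie and read_cookie are made up for this sketch, and the zip code is just the example value from the question.

```python
from http.cookies import SimpleCookie


def build_set_cookie(name, value, days_expire=7, path="/"):
    """Return the value of a Set-Cookie header for a persistent cookie."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["max-age"] = days_expire * 24 * 60 * 60
    cookie[name]["path"] = path
    return cookie.output(header="").strip()


def read_cookie(cookie_header, name, default=None):
    """Parse a request's Cookie header and return one value, or `default`."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    return jar[name].value if name in jar else default


header = build_set_cookie("zip", "55812")
returned_zip = read_cookie("zip=55812; other=1", "zip")
```

The Max-Age attribute is what makes the cookie survive a browser restart; without it (and without Expires) the browser treats it as a session cookie.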
You could manually set the cookie, but depending on your use case (and if you might want to add more types of persistent/session data in future) it might make more sense to use Django's sessions feature. This will let you get and set variables tied internally to the user's session cookie. Cool thing about this is that if you want to store a lot of data tied to a user's session, storing it all in cookies will add a lot of weight to HTTP requests and responses. With sessions the session cookie is all that is sent back and forth (though there is the overhead on Django's end of storing the session data to keep in mind).
Anyone interested in doing this should read the documentation of the Django Sessions framework. It stores a session ID in the user's cookies, but maps all the cookies-like data to your database. This is an improvement on the typical cookies-based workflow for HTTP requests.
Here is an example with a Django view ...
def homepage(request):
    request.session.setdefault('how_many_visits', 0)
    request.session['how_many_visits'] += 1
    print(request.session['how_many_visits'])
    return render(request, 'home.html', {})
If you keep visiting the page over and over, you'll see the value start incrementing up from 1 until you clear your cookies, visit on a new browser, go incognito, or do anything else that sidesteps Django's Session ID cookie.
In addition to jujule's answer below you can find a solution that shows how to set a cookie to Class Based Views responses. You can apply this solution to your view classes that extends from TemplateView, ListView or View.
Below a modified version of jujule's persistent cookie setter method:
import datetime
from django.http import HttpResponse
def set_cookie(
    response: HttpResponse,
    key: str,
    value: str,
    cookie_host: str,
    days_expire: int = 365,
):
    max_age = days_expire * 24 * 60 * 60
    expires = datetime.datetime.strftime(
        datetime.datetime.utcnow() + datetime.timedelta(days=days_expire), "%a, %d-%b-%Y %H:%M:%S GMT",
    )
    # Strip any port from the host so it is a valid cookie domain.
    domain = cookie_host.split(":")[0]
    response.set_cookie(
        key,
        value,
        max_age=max_age,
        expires=expires,
        domain=domain,
        secure=False,
    )
And a sample class-based view that adds a cookie using the persistent cookie setter:
from django.views.generic import TemplateView
class BaseView(TemplateView):
    template_name = "YOUR_TEMPLATE_FILE_PATH"
    def get(self, request, *args, **kwargs):
        response = super().get(request, *args, **kwargs)
        set_cookie(
            response=response,
            key="COOKIE_KEY",
            value="COOKIE_VALUE",
            cookie_host=request.get_host(),
            days_expire=7,
        )
        return response