Django cache.clear() not working (LocMemCache, AWS) - django

Background
I have a website running Django 2.0 on AWS ElasticBeanstalk. I have a couple views on my website that take some time to calculate, so I thought I'd look into some simple caching. I decided on LocMemCache because it looked like the quickest to set up that would meet my needs. (I'm using AWS, so using Memcached apparently requires ElastiCache, which adds cost and is additional setup overhead that I wanted to avoid.)
The views do not change often, and the site is not high-traffic, so I put long timeouts on the caches. There are three views where I have enabled caching:
A report generated inside a template – uses Template Fragment caching
A list of locations requested by AJAX and used in a JS library – uses per-view caching
A dynamically-generated binary file download – uses per-view caching
The caching is set up and works great.
The data that goes into these views is added and edited by other staff at my company, that are used to their changes appearing immediately. So in order to address questions such as, "I updated this data, why has the webpage not updated?" I wanted to create a "Clear Server Cache" button, accessible by staff, to force a cache reset.
The button is set up and functioning. It requests a view that calls cache.clear() from django.core.cache. I used the sledgehammer cache.clear() approach because the way to specify an individual per-view cache in code seems to be a bit clunky and convoluted, so the "clear it all" approach seemed adequate. And at the very least it should always "work" in the sense that all the data will get re-loaded again.
The Problem
When I use the button to call cache.clear(), it only clears the Template Fragment cache. It does not seem to clear the per-view caches. Why?
According to Django Documentation,
Be careful with this; clear() will remove everything from the cache, not just the keys set by your application.
So why is it not touching the per-view caches? Doesn't the warning seem to indicate that clear() is dangerous specifically because it's a sledgehammer and nothing at all is spared? What am I missing?
Does AWS use some kind of special memory that's immune to this sort of culling? (If this is the case, then why are the Template Fragments successfully cleared?) I did notice (and find it interesting) that the cache remains even after deploying a new image to the same environment.
I could switch to using Database caching, but I'd like to understand why this isn't working so I don't need to abandon LocMemCache as an option to ever use in the future.
I could also move the others to use Template Fragment caching, but if I ever expand the caching to fit other needs, I will want to be able to use per-view caching. Also, this solution would be less than ideal for a binary-file-download view.
settings.py
CACHES = {
'default': {
'BACKEND': 'django.core.cache.backends.locmem.LocMemCache',
## 'LOCATION': '',
},
}
portfolio.html (Cache #1 – Template Fragment)
{% load static cache compress %}
...
<div id="total-portfolio-content">
{% cache 7200 portfolio %}{% include 'reports/total_portfolio/report_include.html' %}{% endcache %}
</div>
map/urls.py (Cache #2 – Per-view)
from django.conf.urls import include, url
app_name = 'map'
urlpatterns = [
## Yes, I know this uses the old-style url(). I have plans to upgrade the entire project.
url(r'^(?P<tg>[\w-]+)/data.geojson$',
cache_page(60 * 60 * 12)(
views.NamedGeoJSONLayerView.as_view(model=FacilityCoord)),
name='tg-data'),
]
resources/urls/__init__.py (Cache #3 – Per-view)
from django.conf.urls import include, url
app_name = 'resources'
urlpatterns = [
url(r'^download/$',
cache_page(60 * 60 * 12)(
views.DownloadMetricXLSX.as_view()),
name='download'),
]
myadmin/views.py (Cache Clear button)
from django.core.cache import cache
#staff_member_required(login_url=login_url)
def clear_cache(request):
cache.clear()
## And because that doesn't seem to work as advertised, I also tried....
## taken from <https://djangosnippets.org/snippets/1080/>
try:
cache._cache.clear() # in-memory caching
cache._expire_info.clear()
except AttributeError:
# I think this only applies to filesystem caching? Just grasping at straws.
old_freq = cache._cull_frequency
old_max = cache._max_entries
cache._max_entries = 0
cache._cull_frequency = 1
cache._cull()
cache._cull_frequency = old_freq
cache._max_entries = old_max
return JsonResponse({'success': True})

The problem is in the response headers. The cache_page decorator automatically adds a max-age option to the Cache-Control header in the response. So the cache clear was working properly, clearing the local memory on the server, but the user's browser was instructed not to ask the server for updated data for the duration of the timeout. And my browser was happily complying (even after Ctrl-F5).
Fortunately, there are other decorators you can use to deal with this without much difficulty, now that it's clear what's happening. Django provides a number of other decorators, such as cache_control or never_cache.
I ended up using never_cache, which turned the urls files into...
from django.conf.urls import include, url
from django.views.decorators.cache import never_cache, cache_page
app_name = 'map'
urlpatterns = [
url(r'^(?P<tg>[\w-]+)/data.geojson$',
never_cache(cache_page(60 * 60 * 12)(
views.NamedGeoJSONLayerView.as_view(model=FacilityCoord))),
name='tg-data'),
]
and
from django.conf.urls import include, url
from django.views.decorators.cache import never_cache, cache_page
app_name = 'resources'
urlpatterns = [
url(r'^download/$',
never_cache(cache_page(60 * 60 * 12)(
views.DownloadMetricXLSX.as_view())),
name='download'),
]

Related

What would be considered the root directory to a third party javascript in Django

When using Send Pulse they instruct you place a javascript in the head of your template. This javascript then refers to 2 other scripts that they instruct you to place in the root of you web site. Where can these be placed so that their javascript will automatically find them in the "web root"?
The point is that the service is expecting to find the scripts directly under /, eg mywebsite.com/script1.js. But in Django static files are usually under /static or the equivalent. There is nowhere you can just "place" them to get them to appear at the right URL.
But that doesn't mean you can't do it. The best thing to do would be to add an explicit mapping in your web server (nginx or Apache) for those scripts. For example, in nginx:
location /script1.js { alias /path/to/staticdir/script1.js; }
You have a number of possible options. In no particular order:
#daniel-roseman's suggestion of using a rewrite/alias configured in your webserver
Modify the javascript that you're going to place in the head of your page, so that it loads the scripts from your /static folder (followed by placing those scripts there, of course)
Use an HttpResponse to return these script(s) from a view function, and then map the target URLs (like mywebsite.com/script1.js) to the view.
This might look like:
# urls.py - assuming >= Django 2.0
from django.urls import path
urlpatterns = [
# place script1.js into your static folder, then reference its location within the static folder
path('script1.js', views.javascript_loader, {'script_path': 'script1.js'}),
]
# views.py
from django.http import HttpResponse
from django.conf import settings
import os
def javascript_loader(request, script_path):
# note: you should cache this content for the use case you've described, just
# consider this an illustration of how to load/return content from a static
# file using a view.
full_script_path = os.path.join(settings.STATIC_ROOT, script_path)
with open(full_script_path, 'r') as f:
javascript_contents = f.read()
return HttpResponse(javascript_contents, mimetype="text/javascript")
Again, not sure which solution would be best for your needs, but one of those should get you the result you need.

How to invalidate cache_page in Django?

Here is the problem: I have blog app and I cache the post output view for 5 minutes.
#cache_page(60 * 5)
def article(request, slug):
...
However, I'd like to invalidate the cache whenever a new comment is added to the post.
I'm wondering how best to do so?
I've seen this related question, but it is outdated.
I would cache in a bit different way:
def article(request, slug):
cached_article = cache.get('article_%s' % slug)
if not cached_article:
cached_article = Article.objects.get(slug=slug)
cache.set('article_%s' % slug, cached_article, 60*5)
return render(request, 'article/detail.html', {'article':cached_article})
then saving the new comment to this article object:
# ...
# add the new comment to this article object, then
if cache.get('article_%s' % article.slug):
cache.delete('article_%s' % article.slug)
# ...
This was the first hit for me when searching for a solution, and the current answer wasn't terribly helpful, so after a lot of poking around Django's source, I have an answer for this one.
Yes you can know the key programmatically, but it takes a little work.
Django's page caching works by referencing the request object, specifically the request path and query string. This means that for every request to your page that has a different query string, you will have a different cache key. For most cases, this isn't likely to be a problem, since the page you want to cache/invalidate will be a known string like /blog/my-awesome-year, so to invalidate this, you just need to use Django's RequestFactory:
from django.core.cache import cache
from django.test import RequestFactory
from django.urls import reverse
from django.utils.cache import get_cache_key
cache.delete(get_cache_key(RequestFactory().get("/blog/my-awesome-year")))
If your URLs are a fixed list of values (ie. no differing query strings) then you can stop here. However if you've got lots of different query strings (say ?q=xyz for a search page or something), then your best bet is probably to create a separate cache for each view. Then you can just pass cache="cachename" to cache_page() and later clear that entire cache with:
from django.core.cache import caches
caches["my_cache_name"].clear()
Important note about this tactic
It only really works for unauthenticated pages. The minute your user is logged in, the cookie data is made part of the cache key creation process, and therefore re-creating that key programmatically becomes much harder. I suppose you could try pulling the cookie data out of your session store, but there could be thousands of keys in there, and you'd have to invalidate/pre-cache each and every one of them.

Turn down webpage depending on time of day (DJANGO)

How can i use django templates to remove the contents of a page depending on the time of day?
I'm creating an online food delivery site. The delivery orders are sectioned off into times.
For example for a 7pm delivery drop, i would want the page to show normally until 6pm that day. Then at 6:01pm i would want the page to say something like "This delivery time is not available"
To be honest you shouldn't rely on template logic if you want to prevent unwanted behaviour.
You can create two variables, e.g.
from django.conf import settings
TIME_OPEN = getattr(settings, 'TIME_OPEN', datetime.now().replace(hour=10, minute=30, second=0, microsecond=0).time())
TIME_CLOSED = getattr(settings, 'TIME_CLOSED', datetime.now().replace(hour=21, minute=30, second=0, microsecond=0).time())
In your url patterns you could add something like:
if TIME_OPEN < datetime.now().time() < TIME_CLOSED:
urlpatterns += patterns('shop.customers.views',
(r'^checkout/$', 'checkout'),
)
Based on your new variables you could add a context_processor that supplies a context variable to each template for UI logic, e.g. {'shop_open': True}.
Mind you these examples rely on server time, so you would have to check, because it can differ from your local machine. Another approach could be to create a decorator which can be wrapped around views that require certain times.
So, just to make sure; Don't rely on template logic, protect your views

Decoupling django apps 2 - how to get object information from a slug in the URL

I am trying to de-couple two apps:
Locations - app containing details about some location (town, country, place etc)
Directory - app containing details of places of interest (shop, railway station, pub, etc) - all categorised.
Both locations.Location and directory.Item contain lat/lng coords and I can find items within a certain distance of a specific lat/lng coord.
I'd like to use the following URL structure:
/locations/<location_slug>/directory/<category_slug>/
But I don't want to make my directory app reliant on my location app.
How can I translate this url to use a view like this in my directory app?
items_near(lat, lng, distance, category):
A work around would be to create a new view somewhere that translates this - but where should I put that? if it goes in the Directory app, then I've coupled that with my Location app, and vice versa.
Would a good idea be to place this workaround code inside my project URLs file? Thus keeping clear of both apps? Any issues with doing it like this?
For your urlpattern to work, the view function invoked has to be aware of both Locations and Directories. The short answer is you can put this view function anywhere you want - it's just a python function. But there might be a few logical places for it, outside of your Directory or Location app, that make sense.
First off, I would not put that view code in in your top-level urls.py, as that file is intended for URLconf related code.
A few options of where to put your view:
Create a new view function in a file that lives outside of any particular app. <project_root>/views.py is one possibility. There is nothing wrong with this view calling the item_near(..) view from the Directory app.
# in myproject/urls.py
urlpatterns = (
...
(r'/locations/(?P<location_slug>[\w\-]+)/directory/(?P<category_slug>[\w\-]+)/',
'myproject.views.items_near_from_slug')
)
# in myproject/views.py
from directory.views import items_near
def items_near_from_slug(request, location_slug, category_slug):
location = get_object_or_404(Location, slug=location_slug)
distance = 2 # not sure where this comes from
# And then just invoke the view from your Directory app
return items_near(request, location.lat, location.lng, distance, category_slug)
Create a new app and put the code there, in <my_new_app>/views.py. There is no requirement that a Django app need to have a models.py, urls.py, etc. Just make sure to include the __init__.py if you want Django to load the app properly (if, for instance, you want Django to automatically find templatetags or app specific templates).
Personally, I would go with option 1 only if the project is relatively simple, and <project_root>/views.py is not in danger of becoming cluttered with views for everything. I would go with option 2 otherwise, especially if you anticipate having other code that needs to be aware of both Locations and Directories. With option 2, you can collect the relevant urlpatterns in their own app-specific urls.py as well.
From the django docs here if you're using django >=1.1 then you can pass any captured info into the included application. So splitting across a few files:
# in locations.urls.py
urlpatterns = ('',
(r'location/(?P<location_slug>.*?)/', include('directory.urls')),
#other stuff
)
# in directory.urls.py
urlpatterns = ('',
(r'directory/(?P<directory_slug>.*?)/', 'directory.views.someview'),
#other stuff
)
# in directory.views.py
def someview(request, location_slug = None, directory_slug = None):
#do stuff
Hope that helps. If you're in django < 1.1 I have no idea.
Irrespective of how much ever "re-usable" you make your app, inevitably there is a need for site-specific code.
I think it is logical to create a "site-specific" application that uses the views of the reusable and decoupled apps.

How to generate temporary URLs in Django

Wondering if there is a good way to generate temporary URLs that expire in X days. Would like to email out a URL that the recipient can click to access a part of the site that then is inaccessible via that URL after some time period. No idea how to do this, with Django, or Python, or otherwise.
If you don't expect to get a large response rate, then you should try to store all of the data in the URL itself. This way, you don't need to store anything in the database, and will have data storage proportional to the responses rather than the emails sent.
Updated: Let's say you had two strings that were unique for each user. You can pack them and unpack them with a protecting hash like this:
import hashlib, zlib
import cPickle as pickle
import urllib
my_secret = "michnorts"
def encode_data(data):
"""Turn `data` into a hash and an encoded string, suitable for use with `decode_data`."""
text = zlib.compress(pickle.dumps(data, 0)).encode('base64').replace('\n', '')
m = hashlib.md5(my_secret + text).hexdigest()[:12]
return m, text
def decode_data(hash, enc):
"""The inverse of `encode_data`."""
text = urllib.unquote(enc)
m = hashlib.md5(my_secret + text).hexdigest()[:12]
if m != hash:
raise Exception("Bad hash!")
data = pickle.loads(zlib.decompress(text.decode('base64')))
return data
hash, enc = encode_data(['Hello', 'Goodbye'])
print hash, enc
print decode_data(hash, enc)
This produces:
849e77ae1b3c eJzTyCkw5ApW90jNyclX5yow4koMVnfPz09JqkwFco25EvUAqXwJnA==
['Hello', 'Goodbye']
In your email, include a URL that has both the hash and enc values (properly url-quoted). In your view function, use those two values with decode_data to retrieve the original data.
The zlib.compress may not be that helpful, depending on your data, you can experiment to see what works best for you.
You could set this up with URLs like:
http://yoursite.com/temp/1a5h21j32
Your URLconf would look something like this:
from django.conf.urls.defaults import *
urlpatterns = patterns('',
(r'^temp/(?P<hash>\w+)/$', 'yoursite.views.tempurl'),
)
...where tempurl is a view handler that fetches the appropriate page based on the hash. Or, sends a 404 if the page is expired.
models
class TempUrl(models.Model):
url_hash = models.CharField("Url", blank=False, max_length=32, unique=True)
expires = models.DateTimeField("Expires")
views
def generate_url(request):
# do actions that result creating the object and mailing it
def load_url(request, hash):
url = get_object_or_404(TempUrl, url_hash=hash, expires__gte=datetime.now())
data = get_some_data_or_whatever()
return render_to_response('some_template.html', {'data':data},
context_instance=RequestContext(request))
urls
urlpatterns = patterns('', url(r'^temp/(?P<hash>\w+)/$', 'your.views.load_url', name="url"),)
//of course you need some imports and templates
It depends on what you want to do - one-shot things like account activation or allowing a file to be downloaded could be done with a view which looks up a hash, checks a timestamp and performs an action or provides a file.
More complex stuff such as providing arbitrary data would also require the model containing some reference to that data so that you can decide what to send back. Finally, allowing access to multiple pages would probably involve setting something in the user's session and then using that to determine what they can see, followed by a redirect.
If you could provide more detail about what you're trying to do and how well you know Django, I can make a more specific reply.
I think the solution lies within a combination of all the suggested solutions. I'd suggest using an expiring session so the link will expire within the time period you specify in the model. Combined with a redirect and middleware to check if a session attribute exists and the requested url requires it you can create somewhat secure parts of your site that can have nicer URLs that reference permanent parts of the site. I use this for demonstrating design/features for a limited time. This works to prevent forwarding... I don't do it but you could remove the temp url after first click so only the session attribute will provide access thus more effectively limiting to one user. I personally don't mind if the temp url gets forwarded knowing it will only last for a certain amount of time. Works well in a modified form for tracking invited visits as well.
It might be overkill, but you could use a uuidfield on your model and set up a Celerybeat task to change the uuid at any time interval you choose.
If celery is too much and it might be, you could just store the time the URL is first sent, use the timedelta function whenever it is sent thereafter, and if the elapsed time is greater than what you want just use a redirect. I think the second solution is very straightforward and it would extend easily. It would be a matter of having a model with the URL, time first sent, time most recently sent, a disabled flag, and a Delta that you find acceptable for the URL to live.
A temporary url can also be created by combining the ideas from #ned-batchelder's answer and #matt-howell's answer with Django's signing module.
The signing module provides a convenient way to encode data in the url, if necessary, and to check for link expiration. This way we don't need to touch the database or session/cache.
Here's a minimal example with an index page and a temp page:
The index page has a link to a temporary url, with the specified expiration. If you try to follow the link after expiration, you'll get a status 400 "Bad Request" (or you'll see the SuspiciousOperation error, if DEBUG is True).
urls.py
...
urlpatterns = [
path('', views.index, name='index'),
path('<str:signed_data>/', views.temp, name='temp'),
]
views.py
from django.core import signing
from django.core.exceptions import SuspiciousOperation
from django.http import HttpResponse
from django.urls import reverse
MAX_AGE_SECONDS = 20 # short expiration, for illustrative purposes
def generate_temp_url(data=None):
# signing.dumps() returns a "URL-safe, signed base64 compressed JSON string"
# with a timestamp
return reverse('temp', args=[signing.dumps(data)])
def index(request):
# just a convenient usage example
return HttpResponse(f'temporary link')
def temp(request, signed_data):
try:
# load data and check expiration
data = signing.loads(signed_data, max_age=MAX_AGE_SECONDS)
except signing.BadSignature:
# triggers an HttpResponseBadRequest (status 400) when DEBUG is False
raise SuspiciousOperation('invalid signature')
# success
return HttpResponse(f'Here\'s your data: {data}')
Some notes:
The responses in the example are very rudimentary, and only for illustrative purposes.
Raising a SuspiciousOperation is convenient, but you could e.g. return an HttpResponseNotFound (status 404) instead.
The generate_temp_url() returns a relative path. If you need an absolute url, you can do something like:
temp_url = request.build_absolute_uri(generate_temp_url())
If you're worried about leaking the signed data, have a look at alternatives such as Django's password reset implementation.