How to generate temporary URLs in Django - django

Wondering if there is a good way to generate temporary URLs that expire in X days. Would like to email out a URL that the recipient can click to access a part of the site that then is inaccessible via that URL after some time period. No idea how to do this, with Django, or Python, or otherwise.

If you don't expect to get a large response rate, then you should try to store all of the data in the URL itself. This way, you don't need to store anything in the database, and will have data storage proportional to the responses rather than the emails sent.
Updated: Let's say you had two strings that were unique for each user. You can pack them and unpack them with a protecting hash like this:
import hashlib, zlib
import cPickle as pickle
import urllib
my_secret = "michnorts"
def encode_data(data):
"""Turn `data` into a hash and an encoded string, suitable for use with `decode_data`."""
text = zlib.compress(pickle.dumps(data, 0)).encode('base64').replace('\n', '')
m = hashlib.md5(my_secret + text).hexdigest()[:12]
return m, text
def decode_data(hash, enc):
"""The inverse of `encode_data`."""
text = urllib.unquote(enc)
m = hashlib.md5(my_secret + text).hexdigest()[:12]
if m != hash:
raise Exception("Bad hash!")
data = pickle.loads(zlib.decompress(text.decode('base64')))
return data
hash, enc = encode_data(['Hello', 'Goodbye'])
print hash, enc
print decode_data(hash, enc)
This produces:
849e77ae1b3c eJzTyCkw5ApW90jNyclX5yow4koMVnfPz09JqkwFco25EvUAqXwJnA==
['Hello', 'Goodbye']
In your email, include a URL that has both the hash and enc values (properly url-quoted). In your view function, use those two values with decode_data to retrieve the original data.
The zlib.compress may not be that helpful, depending on your data, you can experiment to see what works best for you.

You could set this up with URLs like:
http://yoursite.com/temp/1a5h21j32
Your URLconf would look something like this:
from django.conf.urls.defaults import *
urlpatterns = patterns('',
(r'^temp/(?P<hash>\w+)/$', 'yoursite.views.tempurl'),
)
...where tempurl is a view handler that fetches the appropriate page based on the hash. Or, sends a 404 if the page is expired.

models
class TempUrl(models.Model):
url_hash = models.CharField("Url", blank=False, max_length=32, unique=True)
expires = models.DateTimeField("Expires")
views
def generate_url(request):
# do actions that result creating the object and mailing it
def load_url(request, hash):
url = get_object_or_404(TempUrl, url_hash=hash, expires__gte=datetime.now())
data = get_some_data_or_whatever()
return render_to_response('some_template.html', {'data':data},
context_instance=RequestContext(request))
urls
urlpatterns = patterns('', url(r'^temp/(?P<hash>\w+)/$', 'your.views.load_url', name="url"),)
//of course you need some imports and templates

It depends on what you want to do - one-shot things like account activation or allowing a file to be downloaded could be done with a view which looks up a hash, checks a timestamp and performs an action or provides a file.
More complex stuff such as providing arbitrary data would also require the model containing some reference to that data so that you can decide what to send back. Finally, allowing access to multiple pages would probably involve setting something in the user's session and then using that to determine what they can see, followed by a redirect.
If you could provide more detail about what you're trying to do and how well you know Django, I can make a more specific reply.

I think the solution lies within a combination of all the suggested solutions. I'd suggest using an expiring session so the link will expire within the time period you specify in the model. Combined with a redirect and middleware to check if a session attribute exists and the requested url requires it you can create somewhat secure parts of your site that can have nicer URLs that reference permanent parts of the site. I use this for demonstrating design/features for a limited time. This works to prevent forwarding... I don't do it but you could remove the temp url after first click so only the session attribute will provide access thus more effectively limiting to one user. I personally don't mind if the temp url gets forwarded knowing it will only last for a certain amount of time. Works well in a modified form for tracking invited visits as well.

It might be overkill, but you could use a uuidfield on your model and set up a Celerybeat task to change the uuid at any time interval you choose.
If celery is too much and it might be, you could just store the time the URL is first sent, use the timedelta function whenever it is sent thereafter, and if the elapsed time is greater than what you want just use a redirect. I think the second solution is very straightforward and it would extend easily. It would be a matter of having a model with the URL, time first sent, time most recently sent, a disabled flag, and a Delta that you find acceptable for the URL to live.

A temporary url can also be created by combining the ideas from #ned-batchelder's answer and #matt-howell's answer with Django's signing module.
The signing module provides a convenient way to encode data in the url, if necessary, and to check for link expiration. This way we don't need to touch the database or session/cache.
Here's a minimal example with an index page and a temp page:
The index page has a link to a temporary url, with the specified expiration. If you try to follow the link after expiration, you'll get a status 400 "Bad Request" (or you'll see the SuspiciousOperation error, if DEBUG is True).
urls.py
...
urlpatterns = [
path('', views.index, name='index'),
path('<str:signed_data>/', views.temp, name='temp'),
]
views.py
from django.core import signing
from django.core.exceptions import SuspiciousOperation
from django.http import HttpResponse
from django.urls import reverse
MAX_AGE_SECONDS = 20 # short expiration, for illustrative purposes
def generate_temp_url(data=None):
# signing.dumps() returns a "URL-safe, signed base64 compressed JSON string"
# with a timestamp
return reverse('temp', args=[signing.dumps(data)])
def index(request):
# just a convenient usage example
return HttpResponse(f'temporary link')
def temp(request, signed_data):
try:
# load data and check expiration
data = signing.loads(signed_data, max_age=MAX_AGE_SECONDS)
except signing.BadSignature:
# triggers an HttpResponseBadRequest (status 400) when DEBUG is False
raise SuspiciousOperation('invalid signature')
# success
return HttpResponse(f'Here\'s your data: {data}')
Some notes:
The responses in the example are very rudimentary, and only for illustrative purposes.
Raising a SuspiciousOperation is convenient, but you could e.g. return an HttpResponseNotFound (status 404) instead.
The generate_temp_url() returns a relative path. If you need an absolute url, you can do something like:
temp_url = request.build_absolute_uri(generate_temp_url())
If you're worried about leaking the signed data, have a look at alternatives such as Django's password reset implementation.

Related

How to invalidate cache_page in Django?

Here is the problem: I have blog app and I cache the post output view for 5 minutes.
#cache_page(60 * 5)
def article(request, slug):
...
However, I'd like to invalidate the cache whenever a new comment is added to the post.
I'm wondering how best to do so?
I've seen this related question, but it is outdated.
I would cache in a bit different way:
def article(request, slug):
cached_article = cache.get('article_%s' % slug)
if not cached_article:
cached_article = Article.objects.get(slug=slug)
cache.set('article_%s' % slug, cached_article, 60*5)
return render(request, 'article/detail.html', {'article':cached_article})
then saving the new comment to this article object:
# ...
# add the new comment to this article object, then
if cache.get('article_%s' % article.slug):
cache.delete('article_%s' % article.slug)
# ...
This was the first hit for me when searching for a solution, and the current answer wasn't terribly helpful, so after a lot of poking around Django's source, I have an answer for this one.
Yes you can know the key programmatically, but it takes a little work.
Django's page caching works by referencing the request object, specifically the request path and query string. This means that for every request to your page that has a different query string, you will have a different cache key. For most cases, this isn't likely to be a problem, since the page you want to cache/invalidate will be a known string like /blog/my-awesome-year, so to invalidate this, you just need to use Django's RequestFactory:
from django.core.cache import cache
from django.test import RequestFactory
from django.urls import reverse
from django.utils.cache import get_cache_key
cache.delete(get_cache_key(RequestFactory().get("/blog/my-awesome-year")))
If your URLs are a fixed list of values (ie. no differing query strings) then you can stop here. However if you've got lots of different query strings (say ?q=xyz for a search page or something), then your best bet is probably to create a separate cache for each view. Then you can just pass cache="cachename" to cache_page() and later clear that entire cache with:
from django.core.cache import caches
caches["my_cache_name"].clear()
Important note about this tactic
It only really works for unauthenticated pages. The minute your user is logged in, the cookie data is made part of the cache key creation process, and therefore re-creating that key programmatically becomes much harder. I suppose you could try pulling the cookie data out of your session store, but there could be thousands of keys in there, and you'd have to invalidate/pre-cache each and every one of them.

Django File-based session doesn't expire

I just realized that my session doesn't expire when I use file-based session engine. Looking at Django code for file-based session, Django doesn't store any expiration information for a session, thus it's never expire unless the session file gets deleted manually.
This looks like a bug to me, as the database-backed session works fine, and I believe regardless of what session back-end developer chooses, they all should behave similarly.
Switching to database-backed session is not an option for me, as I need to store user's session in files.
Can anyone shed some lights?
Is this really a bug?
If yes, how do you suggest me to work around it?
Thanks!
So it looks like you're right. At least in django 1.4, using django.contrib.sessions.backends.file totally ignores SESSION_COOKIE_AGE. I'm not sure whether that's really a bug, or just undocumented.
If you really need this functionality, you can create your own session engine based on the file backend in contrib, but extend it with expiry functionality.
Open django/contrib/sessions/backends/file.py and add the following imports:
import datetime
from django.utils import timezone
Then, add two lines to the load method, so that it appears as below:
def load(self):
session_data = {}
try:
session_file = open(self._key_to_file(), "rb")
if (timezone.now() - datetime.datetime.fromtimestamp(os.path.getmtime(self._key_to_file()))).total_seconds() > settings.SESSION_COOKIE_AGE:
raise IOError
try:
file_data = session_file.read()
# Don't fail if there is no data in the session file.
....
This will actually compare the last modified date on the session file to expire it.
Save this file in your project somewhere and use it as your SESSION_ENGINE instead of 'django.contrib.sessions.backends.file'
You'll also need to enable SESSION_SAVE_EVERY_REQUEST in your settings if you want the session to timeout based on inactivity.
An option would be to use tmpwatch in the directory where you store the sessions
I hit similar issue on Django 3.1. In my case, my program calls the function set_expiry(value) with an integer argument (int data type) before checking session expiry.
Accoring to Django documentation, the data type of argument value to set_expiry() can be int , datetime or timedelta. However for file-based session, expiry check inside load() doesn't work properly only if int argument is passed to set_expiry() beforehand, and such problem doesn't happen to datetime and timedelta argument of set_expiry().
The simple solution (workaround?) is to avoid int argument to set_expiry(value), you can do so by subclassing django.contrib.sessions.backends.file.SessionStore and overriding set_expiry(value) (code sample below), and change parameter SESSION_ENGINE accordingly in settings.py
from datetime import timedelta
from django.contrib.sessions.backends.file import SessionStore as FileSessionStore
class SessionStore(FileSessionStore):
def set_expiry(self, value):
""" force to convert to timedelta format """
if value and isinstance(value, int):
value = timedelta(seconds=value)
super().set_expiry(value=value)
Note:
It's also OK to pass timedelta or datetime to set_expiry(value) , but you will need to handle serialization issue on datetime object.

Django: Passing data to view from url dispatcher without including the data in the url?

I've got my mind set on dynamically creating URLs in Django, based on names stored in database objects. All of these pages should be handled by the same view, but I would like the database object to be passed to the view as a parameter when it is called. Is that possible?
Here is the code I currently have:
places = models.Place.objects.all()
for place in places:
name = place.name.lower()
urlpatterns += patterns('',
url(r'^'+name +'/$', 'misc.views.home', name='places.'+name)
)
Is it possible to pass extra information to the view, without adding more parameters to the URL? Since the URLs are for the root directory, and I still need 404 pages to show on other values, I can't just use a string parameter. Is the solution to give up on trying to add the URLs to root, or is there another solution?
I suppose I could do a lookup on the name itself, since all URLs have to be unique anyway. Is that the only other option?
I think you can pass a dictionary to the view with additional attributes, like this:
url(r'^'+name +'/$', 'misc.views.home', {'place' : place}, name='places.'+name)
And you can change the view to expect this parameter.
That's generally a bad idea since it will query the database for every request, not only requests relevant to that model. A better idea is to come up with the general url composition and use the same view for all of them. You can then retrieve the relevant place inside the view, which will only hit the database when you reach that specific view.
For example:
urlpatterns += patterns('',
url(r'^places/(?P<name>\w+)/$', 'misc.views.home', name='places.view_place')
)
# views.py
def home(request, name):
place = models.Place.objects.get(name__iexact=name)
# Do more stuff here
I realize this is not what you truly asked for, but should provide you with much less headaches.

Django: creating/modifying the request object

I'm trying to build an URL-alias app which allows the user create aliases for existing url in his website.
I'm trying to do this via middleware, where the request.META['PATH_INFO'] is checked against database records of aliases:
try:
src: request.META['PATH_INFO']
alias = Alias.objects.get(src=src)
view = get_view_for_this_path(request)
return view(request)
except Alias.DoesNotExist:
pass
return None
However, for this to work correctly it is of life-importance that (at least) the PATH_INFO is changed to the destination path.
Now there are some snippets which allow the developer to create testing request objects (http://djangosnippets.org/snippets/963/, http://djangosnippets.org/snippets/2231/), but these state that they are intended for testing purposes.
Of course, it could be that these snippets are fit for usage in a live enviroment, but my knowledge about Django request processing is too undeveloped to assess this.
Instead of the approach you're taking, have you considered the Redirects app?
It won't invisibly alias the path /foo/ to return the view bar(), but it will redirect /foo/ to /bar/
(posted as answer because comments do not seem to support linebreaks or other markup)
Thank for the advice, I have the same feeling regarding modifying request attributes. There must be a reason that the Django manual states that they should be considered read only.
I came up with this middleware:
def process_request(self, request):
try:
obj = A.objects.get(src=request.path_info.rstrip('/')) #The alias record.
view, args, kwargs = resolve_to_func(obj.dst + '/') #Modified http://djangosnippets.org/snippets/2262/
request.path = request.path.replace(request.path_info, obj.dst)
request.path_info = obj.dst
request.META['PATH_INFO'] = obj.dst
request.META['ROUTED_FROM'] = obj.src
request.is_routed = True
return view(request, *args, **kwargs)
except A.DoesNotExist: #No alias for this path
request.is_routed = False
except TypeError: #View does not exist.
pass
return None
But, considering the objections against modifying the requests' attributes, wouldn't it be a better solution to just skip that part, and only add the is_routed and ROUTED_TO (instead of routed from) parts?
Code that relies on the original path could then use that key from META.
Doing this using URLConfs is not possible, because this aliasing is aimed at enabling the end-user to configure his own URLs, with the assumption that the end-user has no access to the codebase or does not know how to write his own URLConf.
Though it would be possible to write a function that converts a user-readable-editable file (XML for example) to valid Django urls, it feels that using database records allows a more dynamic generation of aliases (other objects defining their own aliases).
Sorry to necro-post, but I just found this thread while searching for answers. My solution seems simpler. Maybe a) I'm depending on newer django features or b) I'm missing a pitfall.
I encountered this because there is a bot named "Mediapartners-Google" which is asking for pages with url parameters still encoded as from a naive scrape (or double-encoded depending on how you look at it.) i.e. I have 404s in my log from it that look like:
1.2.3.4 - - [12/Nov/2012:21:23:11 -0800] "GET /article/my-slug-name%3Fpage%3D2 HTTP/1.1" 1209 404 "-" "Mediapartners-Google
Normally I'd just ignore a broken bot, but this one I want to appease because it ought to better target our ads (It's google adsense's bot) resulting in better revenue - if it can see our content. Rumor is it doesn't follow redirects so I wanted to find a solution similar to the original Q. I do not want regular clients accessing pages by these broken urls, so I detect the user-agent. Other applications probably won't do that.
I agree a redirect would normally be the right answer.
My (complete?) solution:
from django.http import QueryDict
from django.core.urlresolvers import NoReverseMatch, resolve
class MediapartnersPatch(object):
def process_request(self, request):
# short-circuit asap
if request.META['HTTP_USER_AGENT'] != 'Mediapartners-Google':
return None
idx = request.path.find('?')
if idx == -1:
return None
oldpath = request.path
newpath = oldpath[0:idx]
try:
url = resolve(newpath)
except NoReverseMatch:
return None
request.path = newpath
request.GET = QueryDict(oldpath[idx+1:])
response = url.func(request, *url.args, **url.kwargs)
response['Link'] = '<%s>; rel="canonical"' % (oldpath,)
return response

Capturing URL parameters in request.GET

I am currently defining regular expressions in order to capture parameters in a URL, as described in the tutorial. How do I access parameters from the URL as part the HttpRequest object?
My HttpRequest.GET currently returns an empty QueryDict object.
I'd like to learn how to do this without a library, so I can get to know Django better.
When a URL is like domain/search/?q=haha, you would use request.GET.get('q', '').
q is the parameter you want, and '' is the default value if q isn't found.
However, if you are instead just configuring your URLconf**, then your captures from the regex are passed to the function as arguments (or named arguments).
Such as:
(r'^user/(?P<username>\w{0,50})/$', views.profile_page,),
Then in your views.py you would have
def profile_page(request, username):
# Rest of the method
To clarify camflan's explanation, let's suppose you have
the rule url(regex=r'^user/(?P<username>\w{1,50})/$', view='views.profile_page')
an incoming request for http://domain/user/thaiyoshi/?message=Hi
The URL dispatcher rule will catch parts of the URL path (here "user/thaiyoshi/") and pass them to the view function along with the request object.
The query string (here message=Hi) is parsed and parameters are stored as a QueryDict in request.GET. No further matching or processing for HTTP GET parameters is done.
This view function would use both parts extracted from the URL path and a query parameter:
def profile_page(request, username=None):
user = User.objects.get(username=username)
message = request.GET.get('message')
As a side note, you'll find the request method (in this case "GET", and for submitted forms usually "POST") in request.method. In some cases, it's useful to check that it matches what you're expecting.
Update: When deciding whether to use the URL path or the query parameters for passing information, the following may help:
use the URL path for uniquely identifying resources, e.g. /blog/post/15/ (not /blog/posts/?id=15)
use query parameters for changing the way the resource is displayed, e.g. /blog/post/15/?show_comments=1 or /blog/posts/2008/?sort_by=date&direction=desc
to make human-friendly URLs, avoid using ID numbers and use e.g. dates, categories, and/or slugs: /blog/post/2008/09/30/django-urls/
Using GET
request.GET["id"]
Using POST
request.POST["id"]
Someone would wonder how to set path in file urls.py, such as
domain/search/?q=CA
so that we could invoke query.
The fact is that it is not necessary to set such a route in file urls.py. You need to set just the route in urls.py:
urlpatterns = [
path('domain/search/', views.CityListView.as_view()),
]
And when you input http://servername:port/domain/search/?q=CA. The query part '?q=CA' will be automatically reserved in the hash table which you can reference though
request.GET.get('q', None).
Here is an example (file views.py)
class CityListView(generics.ListAPIView):
serializer_class = CityNameSerializer
def get_queryset(self):
if self.request.method == 'GET':
queryset = City.objects.all()
state_name = self.request.GET.get('q', None)
if state_name is not None:
queryset = queryset.filter(state__name=state_name)
return queryset
In addition, when you write query string in the URL:
http://servername:port/domain/search/?q=CA
Do not wrap query string in quotes. For example,
http://servername:port/domain/search/?q="CA"
def some_view(request, *args, **kwargs):
if kwargs.get('q', None):
# Do something here ..
For situations where you only have the request object you can use request.parser_context['kwargs']['your_param']
You have two common ways to do that in case your URL looks like that:
https://domain/method/?a=x&b=y
Version 1:
If a specific key is mandatory you can use:
key_a = request.GET['a']
This will return a value of a if the key exists and an exception if not.
Version 2:
If your keys are optional:
request.GET.get('a')
You can try that without any argument and this will not crash.
So you can wrap it with try: except: and return HttpResponseBadRequest() in example.
This is a simple way to make your code less complex, without using special exceptions handling.
I would like to share a tip that may save you some time.
If you plan to use something like this in your urls.py file:
url(r'^(?P<username>\w+)/$', views.profile_page,),
Which basically means www.example.com/<username>. Be sure to place it at the end of your URL entries, because otherwise, it is prone to cause conflicts with the URL entries that follow below, i.e. accessing one of them will give you the nice error: User matching query does not exist.
I've just experienced it myself; hope it helps!
These queries are currently done in two ways. If you want to access the query parameters (GET) you can query the following:
http://myserver:port/resource/?status=1
request.query_params.get('status', None) => 1
If you want to access the parameters passed by POST, you need to access this way:
request.data.get('role', None)
Accessing the dictionary (QueryDict) with 'get()', you can set a default value. In the cases above, if 'status' or 'role' are not informed, the values ​​are None.
If you don't know the name of params and want to work with them all, you can use request.GET.keys() or dict(request.GET) functions
This is not exactly what you asked for, but this snippet is helpful for managing query_strings in templates.
If you only have access to the view object, then you can get the parameters defined in the URL path this way:
view.kwargs.get('url_param')
If you only have access to the request object, use the following:
request.resolver_match.kwargs.get('url_param')
Tested on Django 3.
views.py
from rest_framework.response import Response
def update_product(request, pk):
return Response({"pk":pk})
pk means primary_key.
urls.py
from products.views import update_product
from django.urls import path
urlpatterns = [
...,
path('update/products/<int:pk>', update_product)
]
You might as well check request.META dictionary to access many useful things like
PATH_INFO, QUERY_STRING
# for example
request.META['QUERY_STRING']
# or to avoid any exceptions provide a fallback
request.META.get('QUERY_STRING', False)
you said that it returns empty query dict
I think you need to tune your url to accept required or optional args or kwargs
Django got you all the power you need with regrex like:
url(r'^project_config/(?P<product>\w+)/$', views.foo),
more about this at django-optional-url-parameters
This is another alternate solution that can be implemented:
In the URL configuration:
urlpatterns = [path('runreport/<str:queryparams>', views.get)]
In the views:
list2 = queryparams.split("&")
url parameters may be captured by request.query_params
It seems more recommended to use request.query_params. For example,
When a URL is like domain/search/?q=haha, you would use request.query_params.get('q', None)
https://www.django-rest-framework.org/api-guide/requests/
"request.query_params is a more correctly named synonym for request.GET.
For clarity inside your code, we recommend using request.query_params instead of the Django's standard request.GET. Doing so will help keep your codebase more correct and obvious - any HTTP method type may include query parameters, not just GET requests."