Is this Django Middleware Thread-safe? - django

I am writing forum app on Django using custom session/auth/users/acl system. One of goals is allowing users to browse and use my app even if they have cookies off. Coming from PHP world, best solution for problem is appending sid= to every link on page. Here is how I plan to do it:
Session middleware checks if user has session cookie or remember me cookie. If he does, this most likely means cookies work for him. If he doesnt, we generate new session ID, open new session (make new entry in sessions table in DB), then send cookie and redirect user to where he is, but with SID appended to url. After redirect middleware will see if session id can be obtained from either cookie or GET. If its cookie, we stop adding sid to urls. If its GET, we keep them.
I plan to insert SID= part into url's by decorating django.core.urlresolvers.reverse and reverse_lazy with my own function that appends ?sid= to them. However this raises some problems because both middlewares urlresolvers and are not thread safe. To overcome this I created something like this:
class SessionMiddleware(object):
using_decorator = False
original_reverse = None
def process_request(self, request):
self.using_decorator = True
self.original_reverse = urlresolvers.reverse
urlresolvers.reverse = session_url_decorator(urlresolvers.reverse, 's87add8ash7d6asdgas7dasdfsadas')
def process_response(self, request, response):
# Turn off decorator if we are using it
if self.using_decorator:
urlresolvers.reverse = self.original_reverse
self.using_decorator = False
return response
If SID has to be passed via links, process_request sets using_decorator to true and stores undecorated urlresolvers.revers in separate method. After page is rendered process_response checks using_decorator to see if it has to perform "garbage collection". If it does, it returns reverse function to original undecorated state.
My question is, is this approach thread-safe? Or will increase in traffic on my forum may result in middleware decorating those functions again and again and again, failing to run "garbage collection"? I also tought about using regex to simply skim generated HTML response for links and providing template filters and variables for manually adding SID to places that are omitted by regex.
Which approach is better? Also is current one thread safe?

First of all: Using SIDs in the URL is quite dangerous, eg if you copy&paste a link for a friend he is signed in as you. Since most users don't know what a SID is they will run into this issue. As such you should never ever use SIDs in the url and since Facebook and friends all require cookies you should be fine too...
Considering that, monkeypatching urlresolvers.reverse luckily doesn't work! Might be doable with a custom URLResolvers subclass, but I recommend against it.
And yes, your middleware is not threadsafe. Middlewares are initialized only once and shared between threads, meaning that storing anything on self is not threadsafe.

Related

How to prevent Django from writing to django_session table for certain URLs

Apologies if my question is very similar to this one and my approach to trying to solve the issue is 100% based on the answers to that question but I think this is slightly more involved and may target a part of Django that I do not fully understand.
I have a CMS system written in Django 1.5 with a few APIs accessible by two desktop applications which cannot make use of cookies as a browser does.
I noticed that every time an API call is made by one of the applications (once every 3 seconds), a new entry is added to django_session table. Looking closely at this table and the code, I can see that all entries to a specific URL are given the same session_data value but a different session_key. This is probably because Django determines that when one of these calls is made from a cookie-less application, the request.session._session_key is None.
The result of this is that thousands of entries are created every day in django_session table and simply running ./manage clearsessions using a daily cron will not remove them from this table, making whole database quite large for no obvious benefit. Note that I even tried set_expiry(1) for these requests, but ./manage clearsessions still doesn't get rid of them.
To overcome this problem through Django, I've had to override 3 Django middlewares as I'm using SessionMiddleware, AuthenticationMiddleware and MessageMiddleware:
from django.contrib.sessions.middleware import SessionMiddleware
from django.contrib.auth.middleware import AuthenticationMiddleware
from django.contrib.messages.middleware import MessageMiddleware
class MySessionMiddleware(SessionMiddleware):
def process_request(self, request):
if ignore_these_requests(request):
return
super(MySessionMiddleware, self).process_request(request)
def process_response(self, request, response):
if ignore_these_requests(request):
return response
return super(MySessionMiddleware, self).process_response(request, response)
class MyAuthenticationMiddleware(AuthenticationMiddleware):
def process_request(self, request):
if ignore_these_requests(request):
return
super(MyAuthenticationMiddleware, self).process_request(request)
class MyMessageMiddleware(MessageMiddleware):
def process_request(self, request):
if ignore_these_requests(request):
return
super(MyMessageMiddleware, self).process_request(request)
def ignore_these_requests(request):
if request.POST and request.path.startswith('/api/url1/'):
return True
elif request.path.startswith('/api/url2/'):
return True
return False
Although the above works, I can't stop thinking that I may have made this more complex that it really is and that this is not the most efficient approach as 4 extra checks are made for every single request.
Are there any better ways to do the above in Django? Any suggestions would be greatly appreciated.
Dirty hack: removing session object conditionally.
One approach would be including a single middleware discarding the session object conditional to the request. It's a bit of a dirty hack for two reasons:
The Session object is created at first and removed later. (inefficient)
You're relying on the fact that the Session object isn't written to the database yet at that point. This may change in future Django versions (though not very likely).
Create a custom middleware:
class DiscardSessionForAPIMiddleware(object):
def process_request(self, request):
if request.path.startswith("/api/"): # Or any other condition
del request.session
Make sure you install this after the django.contrib.sessions.middleware.SessionMiddleware in the MIDDLEWARE_CLASSES tuple in your settings.py.
Also check that settings.SESSION_SAVE_EVERY_REQUEST is set to False (the default). This makes it delay the write to the database until the data is modified.
Alternatives (untested)
Use process_view instead of process_request in the custom middleware so you can check for the view instead of the request path. Advantage: condition check is better. Disadvantage: other middleware might already have done something with the session object and then this approach fails.
Create a custom decorator (or a shared base class) for your API views deleting the session object in there. Advantage: responsibility for doing this will be with the views, the place where you probably like it best (view providing the API). Disadvantage: same as above, but deleting the session object in an even later stage.
Make sure your settings.SESSION_SAVE_EVERY_REQUEST is set to False. That will go a long way in ensuring sessions aren't saved every time.
Also, if you have any ajax requests going to your server, ensure that the request includes the cookie information so that the server doesn't think each request belongs to a different person.

Django testing named URLs with additional GET parameters

I am trying to write some tests for a Django application I'm working on but I haven't yet decided on the exact urls I want to use for each view. Therefore, I'm using named urls in the tests.
For example, I have a url named dashboard:
c = Client()
resp = c.get(reverse('dashboard'))
This view should only be available to logged in users. If the current user is anonymous, it should redirect them to the login page, which is also a named url. However, when it does this, it uses an additional GET parameter to keep track of the url it just came from, which results in the following:
/login?next=dashboard
When I then try to test this redirect, it fails because of these additional parameters:
# It's expecting '/login' but gets '/login?next=dashboard'
self.assertRedirects(resp, reverse('login'))
Obviously, it works if I hard code them into the test:
self.assertRedirects(resp, '/login?next=dashboard')
But then, if I ever decide to change the URL for my dashboard view, I'd have to update every test that uses it.
Is there something I can do to make it easier to handle these extra parameters?
Any advice appreciated.
Thanks.
As you can see, reverse(...) returns a string. You can use it as:
self.assertRedirects(resp, '%s?next=dashboard' % reverse('login'))

django cache conditional view processing latest_entry

I have a model called aps
class aps(model.Model):
u=models.ForeignKey(User, related_name='u')
when=models.DateTimeField() #datetime when row was inserted
a=models.ForeignKey(User, related_name='a')
a_read=models.BooleanField()
a_last_read=models.DateTimeField()
The records from aps are retrieved like this:
def displayAps(request, name)
apz=aps.objects.order_by('-when').filter(u=User.objects.get(username=name))
return render_to_response(template.html, {'apz':apz})
And further they are shown in template.html ...
What I am trying to achieve is actually what django docz mention here about conditional view processing
I do something like:
def latest_entry(request, name):
return aps.objects.filter().latest('when').when
#cache_page(60*15)
#last_modified(latest_entry)
def displayAps(request, name)
apz=aps.objects.order_by('-when').filter(u=User.objects.get(username=name))
return render_to_response(template.html, {'apz':apz})
If new rows are added the content is not modified, but retrieved from cache. I need to delete the cache files and 'shift refresh' the browser to see the new rows.
Anyone sees what I am doing wrong here?
If you take out the cache_page decorator, does it work?
The cache_page decorator is the first piece of code that actually gets called in your view function. It is checking the timestamp, and returning the cached data, as it is supposed to. If the cache hasn't expired, the last_modified decorator won't ever be called.
You probably want to be more careful about mixing conditional response processing and static caching, anyway. They accomplish similar things, but using very different mechanisms.
cache_page tells django to only use the view to render an actual response every n seconds. If another request comes in before that, the same rendered content will be returned to the client -- whether it is actually stale or not. This reduces your server load, but doesn't do anything to reduce your bandwidth.
last_modified handles the case where the client says "I have a version of this page that is this old; is it still good?" In that case, your server can check the database, and return a very short "It's still good" response if the database hasn't changed. This cuts down your bandwidth needs significantly for those cases, but you still need to go to the database to determine whether the client's cache is stale or not, so your server load may be almost the same.
Like I mentioned above, you can't just apply cache_page before last_modfied -- if the database has changed, cache_page won't know about it. Worse, if the cache timeout has expired, but the database hasn't changed, then you might end up caching the "304 Not modified" message, and sending that for all subsequent visitors for the next fifteen minutes.
You could apply the decorators in the other order, but you have to make a request from the database for each request, and you could still get into a situation where the database has changed, but the cache hasn't expired -- in that case, the client could still be getting the old version of the page, even though the server has already hit the database to determine that it has been updated.
The filter depends on some logged user or something, you should id this user in cookies and use django's vary_on_cookie.

What/Where to modify django-registration to use with mobile broswers

I am using django-registration along side django auth for my client account creation and login.
Our site will be used by moble users and desktop users. We just started tackling the idea of mobile users by loading different templates from the view depending on user agent strings. It's cleanly done, but I am not sure if it is the right way to do it as we are now stuck with what to do on views that are not easily accessible (that we didn't write ourselves).
Which brings me to the problem at hand:
I have no idea how to tackle redirecting the mobile user away from the login url that django-registration/auth sends them to (the desktop version).
I could change tactics and tackle the different browsers in the template files themselves. That feels like it is going to get messy fast. I don't like that idea at all!
Or I stay with my current method, which is to render the request with different templates based on user agent strings. Then i need to know how I should be dealing with django-registration (how to load a different set of templates based on the user agent string). I would rather not change the django-registration code, if just to make updating modules easier.
The django registration templates are very simple and are used very rarely. I simply handle these as special cases and just come up with a base.html for that works on both platforms reasonably well.
My registration pages look very simple, many sites do this and it is not unexpected.
Another option is to us a middleware which sets the template directory based upon detecting if it is a mobile device. You can detect the mobile browser like this Detect mobile browser (not just iPhone) in python view and then have a middleware that uses the make_tls_property trick to update the TEMPLATE_DIRS something like this:
TEMPLATE_DIRS = settings.__dict__['_wrapped'].__class__.TEMPLATE_DIRS = make_tls_property(settings.TEMPLATE_DIRS)
class MobileMiddleware(object):
"""Sets settings.SITE_ID based on request's domain"""
def process_request(self, request):
if *mobile*:
TEMPLATE_DIRS.value = *mobiletemplates* + settings.BASE_TEMPLATE_DIRS
else:
TEMPLATE_DIRS.value = *normaltemplates* + settings.BASE_TEMPLATE_DIRS
Just to be clear, make_tls_property, which is part of djangotoolbox, makes the TEMPLATE_DIRS setting a per thread variable instead of a global variable so each request response loop gets it's own "version" of the variable.
One method is to simply write your own login view that calls the django-registration view to do the hard work, but passing it a different template depending on the context:
def login(request, *args, **kwargs):
my_kwargs = kwargs.copy()
if <mobile condition>:
my_kwargs['template_name'] = 'my_app/some_template.html'
else:
my_kwargs['template_name'] = 'my_app/some_other_template.html'
from django.contrib import auth
return auth.login(request, *args, **my_kwargs)

django cross-site reverse a url

I have a similar question than django cross-site reverse. But i think I can't apply the same solution.
I'm creating an app that lets the users create their own site. After completing the signup form the user should be redirected to his site's new post form. Something along this lines:
new_post_url = 'http://%s.domain:9292/manage/new_post %site.domain'
logged_user = authenticate(username=user.username, password=user.password)
if logged_user is not None:
login(request, logged_user)
return redirect(new_product_url)
Now, I know that "new_post_url" is awful and makes babies cry so I need to reverse it in some way. I thought in using django.core.urlresolvers.reverse to solve this but that only returns urls on my domain, and not in the user's newly created site, so it doesn't works for me.
So, do you know a better/smarter way to solve this?
It looks like the domain is a subdomain of your own website so what does it matter that you can't reverse that part? Using reverse it doesn't use full domain paths, it gives you the path from the root of the project, so you can simply do something like:
new_post_uri = 'http://%s.domain:9292%s' % (site.domain, reverse('manage_new_post'))
This way you're still using reverse so you're not hardcoding the urls (and making babies cry) and you're not realy having an issue as far as I can see.
Finally, if you do not wish to hardcode your own domain in the code, uses Django's Sites model to get the current site, make sure to modify it from the default example.com to your own domain, so finally your code can be:
current_site = Site.objects.get_current() # See the docs for other options
return redirect('http://%s.%s%s' % (site.domain, current_site, reverse('manage_new_post')))
If you need to get the domain without using the Sites object, your best bet may be request.get_host(), which will get the full domain plus port, but not the protocol.
Hopefully that explains things for you. You can format things a bit better, but that's the gist of it.
redirect also optionally takes a view-name as the argument, So, since you already have all variables needed by it, just pass the view name with all the required arguments, and be done with it, rather than trying out a complicated reverse!
If you still want a reverse nature, may be you should use, get_absolute_url on the model of the Site.