Any thoughts on A/B testing in Django based project? [closed] - django

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 2 years ago.
Improve this question
We just now started doing the A/B testing for our Django based project. Can I get some information on best practices or useful insights about this A/B testing.
Ideally each new testing page will be differentiated with a single parameter(just like Gmail). mysite.com/?ui=2 should give a different page. So for every view I need to write a decorator to load different templates based on the 'ui' parameter value. And I dont want to hard code any template names in decorators. So how would urls.py url pattern will be?

It's useful to take a step back and abstract what A/B testing is trying to do before diving into the code. What exactly will we need to conduct a test?
A Goal that has a Condition
At least two distinct Paths to meet the Goal's Condition
A system for sending viewers down one of the Paths
A system for recording the Results of the test
With this in mind let's think about implementation.
The Goal
When we think about a Goal on the web usually we mean that a user reaches a certain page or that they complete a specific action, for example successfully registering as a user or getting to the checkout page.
In Django we could model that in a couple of ways - perhaps naively inside a view, calling a function whenever a Goal has been reached:
def checkout(request):
a_b_goal_complete(request)
...
But that doesn't help because we'll have to add that code everywhere we need it - plus if we're using any pluggable apps we'd prefer not to edit their code to add our A/B test.
How can we introduce A/B Goals without directly editing view code? What about a Middleware?
class ABMiddleware:
def process_request(self, request):
if a_b_goal_conditions_met(request):
a_b_goal_complete(request)
That would allow us to track A/B Goals anywhere on the site.
How do we know that a Goal's conditions has been met? For ease of implementation I'll suggest that we know a Goal has had it's conditions met when a user reaches a specific URL path. As a bonus we can measure this without getting our hands dirty inside a view. To go back to our example of registering a user we could say that this goal has been met when the user reaches the URL path:
/registration/complete
So we define a_b_goal_conditions_met:
a_b_goal_conditions_met(request):
return request.path == "/registration/complete":
Paths
When thinking about Paths in Django it's natural to jump to the idea of using different templates. Whether there is another way remains to be explored. In A/B testing you make small differences between two pages and measure the results. Therefore it should be a best practice to define a single base Path template from which all Paths to the Goal should extend.
How should render these templates? A decorator is probably a good start - it's a best practice in Django to include a parameter template_name to your views a decorator could alter this parameter at runtime.
#a_b
def registration(request, extra_context=None, template_name="reg/reg.html"):
...
You could see this decorator either introspecting the wrapped function and modifying the template_name argument or looking up the correct templates from somewhere (like a Model). If we didn't want to add the decorator to every function we could implement this as part of our ABMiddleware:
class ABMiddleware:
...
def process_view(self, request, view_func, view_args, view_kwargs):
if should_do_a_b_test(...) and "template_name" in view_kwargs:
# Modify the template name to one of our Path templates
view_kwargs["template_name"] = get_a_b_path_for_view(view_func)
response = view_func(view_args, view_kwargs)
return response
We'd need also need to add some way to keep track of which views have A/B tests running etc.
A system for sending viewers down a Path
In theory this is easy but there are lot of different implementations so it's not clear which one is best. We know a good system should divide users evenly down the path - Some hash method must be used - Maybe you could use the modulus of memcache counter divided by the number of Paths - maybe there is a better way.
A system for recording the Results of the Test
We need to record how many users went down what Path - we'll also need access to this information when the user reaches the goal (we need to be able to say what Path they came down to met the Condition of the Goal) - we'll use some kind of Model(s) to record the data and either Django Sessions or Cookies to persist the Path information until the user meets the Goal condition.
Closing Thoughts
I've given a lot of pseudo code for implementing A/B testing in Django - the above is by no means a complete solution but a good start towards creating a reusable framework for A/B testing in Django.
For reference you may want to look at Paul Mar's Seven Minute A/Bs on GitHub - it's the ROR version of the above!
http://github.com/paulmars/seven_minute_abs/tree/master
Update
On further reflection and investigation of Google Website Optimizer it's apparent that there are gaping holes in the above logic. By using different templates to represent Paths you break all caching on the view (or if the view is cached it will always serve the same path!). Instead, of using Paths, I would instead steal GWO terminology and use the idea of Combinations - that is one specific part of a template changing - for instance, changing the <h1> tag of a site.
The solution would involve template tags which would render down to JavaScript. When the page is loaded in the browser the JavaScript makes a request to your server which fetches one of the possible Combinations.
This way you can test multiple combinations per page while preserving caching!
Update
There still is room for template switching - say for example you introduce an entirely new homepage and want to test it's performance against the old homepage - you'd still want to use the template switching technique. The thing to keep in mind is your going to have to figure out some way to switch between X number of cached versions of the page. To do this you'd need to override the standard cached middleware to see if their is a A/B test running on the requested URL. Then it could choose the correct cached version to show!!!
Update
Using the ideas described above I've implemented a pluggable app for basic A/B testing Django. You can get it off Github:
http://github.com/johnboxall/django-ab/tree/master

If you use the GET parameters like you suggsted (?ui=2), then you shouldn't have to touch urls.py at all. Your decorator can inspect request.GET['ui'] and find what it needs.
To avoid hardcoding template names, maybe you could wrap the return value from the view function? Instead of returning the output of render_to_response, you could return a tuple of (template_name, context) and let the decorator mangle the template name. How about something like this? WARNING: I haven't tested this code
def ab_test(view):
def wrapped_view(request, *args, **kwargs):
template_name, context = view(request, *args, **kwargs)
if 'ui' in request.GET:
template_name = '%s_%s' % (template_name, request.GET['ui'])
# ie, 'folder/template.html' becomes 'folder/template.html_2'
return render_to_response(template_name, context)
return wrapped_view
This is a really basic example, but I hope it gets the idea across. You could modify several other things about the response, such as adding information to the template context. You could use those context variables to integrate with your site analytics, like Google Analytics, for example.
As a bonus, you could refactor this decorator in the future if you decide to stop using GET parameters and move to something based on cookies, etc.
Update If you already have a lot of views written, and you don't want to modify them all, you could write your own version of render_to_response.
def render_to_response(template_list, dictionary, context_instance, mimetype):
return (template_list, dictionary, context_instance, mimetype)
def ab_test(view):
from django.shortcuts import render_to_response as old_render_to_response
def wrapped_view(request, *args, **kwargs):
template_name, context, context_instance, mimetype = view(request, *args, **kwargs)
if 'ui' in request.GET:
template_name = '%s_%s' % (template_name, request.GET['ui'])
# ie, 'folder/template.html' becomes 'folder/template.html_2'
return old_render_to_response(template_name, context, context_instance=context_instance, mimetype=mimetype)
return wrapped_view
#ab_test
def my_legacy_view(request, param):
return render_to_response('mytemplate.html', {'param': param})

Justin's response is right... I recommend you vote for that one, as he was first. His approach is particularly useful if you have multiple views that need this A/B adjustment.
Note, however, that you don't need a decorator, or alterations to urls.py, if you have just a handful of views. If you left your urls.py file as is...
(r'^foo/', my.view.here),
... you can use request.GET to determine the view variant requested:
def here(request):
variant = request.GET.get('ui', some_default)
If you want to avoid hardcoding template names for the individual A/B/C/etc views, just make them a convention in your template naming scheme (as Justin's approach also recommends):
def here(request):
variant = request.GET.get('ui', some_default)
template_name = 'heretemplates/page%s.html' % variant
try:
return render_to_response(template_name)
except TemplateDoesNotExist:
return render_to_response('oops.html')

A code based on the one by Justin Voss:
def ab_test(force = None):
def _ab_test(view):
def wrapped_view(request, *args, **kwargs):
request, template_name, cont = view(request, *args, **kwargs)
if 'ui' in request.GET:
request.session['ui'] = request.GET['ui']
if 'ui' in request.session:
cont['ui'] = request.session['ui']
else:
if force is None:
cont['ui'] = '0'
else:
return redirect_to(request, force)
return direct_to_template(request, template_name, extra_context = cont)
return wrapped_view
return _ab_test
example function using the code:
#ab_test()
def index1(request):
return (request,'website/index.html', locals())
#ab_test('?ui=33')
def index2(request):
return (request,'website/index.html', locals())
What happens here:
1. The passed UI parameter is stored in the session variable
2. The same template loads every time, but a context variable {{ui}} stores the UI id (you can use it to modify the template)
3. If user enters the page without ?ui=xx then in case of index2 he's redirected to '?ui=33', in case of index1 the UI variable is set to 0.
I use 3 to redirect from the main page to Google Website Optimizer which in turn redirects back to the main page with a proper ?ui parameter.

You can also A/B test using Google Optimize. To do so you'll have to add Google Analytics to your site and then when you create a Google Optimize experiment each user will get a cookie with a different experiment variant (according to the weight for each variant). You can then extract the variant from the cookie and display various versions of your application. You can use the following snippet to extract the variant:
ga_exp = self.request.COOKIES.get("_gaexp")
parts = ga_exp.split(".")
experiments_part = ".".join(parts[2:])
experiments = experiments_part.split("!")
for experiment_str in experiments:
experiment_parts = experiment_str.split(".")
experiment_id = experiment_parts[0]
variation_id = int(experiment_parts[2])
experiment_variations[experiment_id] = variation_id
However there is a django package that integrates well with Google Optimize: https://github.com/adinhodovic/django-google-optimize/.
And here is a blog post on how to use the package and how Google Optimize works: https://hodovi.cc/blog/django-b-testing-google-optimize/.

Related

Does this use of the require_GET decorated add value?

I was reading this page on how to add a favicon to a Django website. In this article, the following code is presented to serve a favicon from the project root:
#require_GET
#cache_control(max_age=60 * 60 * 24, immutable=True, public=True) # one day
def favicon(request: HttpRequest) -> HttpResponse:
if settings.DEBUG:
name = "favicon-debug.png"
else:
name = "favicon.png"
file = (settings.BASE_DIR / "static" / name).open("rb")
return FileResponse(file)
I understand the #require_GET decorator will ensure this "page" (or this case just an image, really) can only be opened using a GET request. But I wonder, given that all we output here is just a static image, is there any value in doing so? Why would this decorator be there?
It's certainly not going to hurt, but it definitely isn't necessary. In your case, your favicon() method only serves one function and will only ever execute in one way. Therefore, the decorator is pretty much entirely superfluous.
The author likely just put it there because they heard it was "best practice". They probably also had been using it for all their other views.
Here's a more complete explanation on the Security Stack Exchange: https://security.stackexchange.com/questions/199776/why-should-someone-block-all-methods-other-than-get-and-post-in-a-restful-applic.

Django syndication framework: prevent appending SITE_ID to the links

According to the documentation here: https://djangobook.com/syndication-feed-framework/
If link doesn’t return the domain, the syndication framework will
insert the domain of the current site, according to your SITE_ID
setting
However, I'm trying to generate a feed of magnet: links. The framework doesn't recognize this and attempts to append the SITE_ID, such that the links end up like this (on localhost):
<link>http://localhost:8000magnet:?xt=...</link>
Is there a way to bypass this?
Here's a way to do it with monkey patching, much cleaner.
I like to create a separate folder "django_patches" for these kinds of things:
myproject/django_patches/__init__.py
from django.contrib.syndication import views
from django.contrib.syndication.views import add_domain
def add_domain_if_we_should(domain, url, secure=False):
if url.startswith('magnet:'):
return url
else:
return add_domain(domain, url, secure=False)
views.add_domain = add_domain_if_we_should
Next, add it to your INSTALLED_APPS so that you can patch the function.
settings.py
INSTALLED_APPS = [
'django_overrides',
...
]
This is a bit gnarly, but here's a potential solution if you don't want to give up on the Django framework:
The problem is that the method add_domain is buried deep in a huge method within syndication framework, and I don't see a clean way to override it. Since this method is used for both the feed URL and the feed items, a monkey patch of add_domain would need to consider this.
Django source:
https://github.com/django/django/blob/master/django/contrib/syndication/views.py#L178
Steps:
1: Subclass the Feed class you're using and do a copy-paste override of the huge method get_feed
2: Modify the line:
link = add_domain(
current_site.domain,
self._get_dynamic_attr('item_link', item),
request.is_secure(),
)
To something like:
link = self._get_dynamic_attr('item_link', item)
I did end up digging through the syndication source code and finding no easy way to override it and did some hacky monkey patching. (Unfortunately I did it before I saw the answers posted here, all of which I assume will work about as well as this one)
Here's how I did it:
def item_link(self, item):
# adding http:// means the internal get_feed won't modify it
return "http://"+item.magnet_link
def get_feed(self, obj, request):
# hacky way to bypass the domain handling
feed = super().get_feed(obj, request)
for item in feed.items:
# strip that http:// we added above
item['link'] = item['link'][7:]
return feed
For future readers, this was as of Django 2.0.1. Hopefully in a future patch they allow support for protocols like magnet.

Django context processors and URL arguments

I have some code that is repeated at the start of my Django views. It basically just adds some variables to the context, but based on the URL argument, e.g.
def someView(request, id):
target = Target.objects.get(id=id)
# name will be added to ctx
name = target.name
(there are more attributes added and other attributes from related models, but this gives the general idea --- There are quite a few lines of repeat code at the start of each view)
I thought I could make my code more DRY by taking advantage of Django's context processors, but it would seem these don't access to the URL arguments?
Is there another way to avoid these repeat lines? Maybe middleware or something else?
You can access the URL parameters via request through the resolver_match attribute. So for instance you can do request.resolver_match.kwargs['id'] to get the ID kwarg.

Is there a django shorcut for required params in the view

I am developing a django app with a large amount of views. But every time I write a view I have to cater the possibility of params being not provided.. I am writing the same code again and again. I was wondering if there was a django shortcut that could do the job and return a standard error message if a param was not provided.. I want to do something like..
#required_params({'get': ['param1', 'param2'], 'post': ['param3', 'param4']})
def my_view(request):
# Do my stuff
A possible solution for this would be to make use of the 'decorator_from_middleware' functionality: https://docs.djangoproject.com/en/dev/ref/utils/#django.utils.decorators.decorator_from_middleware
It would allow you to process the request on a per view basis and assert that the parameters from the requet meet your criteria and if not return a standard error response.

Django: creating/modifying the request object

I'm trying to build an URL-alias app which allows the user create aliases for existing url in his website.
I'm trying to do this via middleware, where the request.META['PATH_INFO'] is checked against database records of aliases:
try:
src: request.META['PATH_INFO']
alias = Alias.objects.get(src=src)
view = get_view_for_this_path(request)
return view(request)
except Alias.DoesNotExist:
pass
return None
However, for this to work correctly it is of life-importance that (at least) the PATH_INFO is changed to the destination path.
Now there are some snippets which allow the developer to create testing request objects (http://djangosnippets.org/snippets/963/, http://djangosnippets.org/snippets/2231/), but these state that they are intended for testing purposes.
Of course, it could be that these snippets are fit for usage in a live enviroment, but my knowledge about Django request processing is too undeveloped to assess this.
Instead of the approach you're taking, have you considered the Redirects app?
It won't invisibly alias the path /foo/ to return the view bar(), but it will redirect /foo/ to /bar/
(posted as answer because comments do not seem to support linebreaks or other markup)
Thank for the advice, I have the same feeling regarding modifying request attributes. There must be a reason that the Django manual states that they should be considered read only.
I came up with this middleware:
def process_request(self, request):
try:
obj = A.objects.get(src=request.path_info.rstrip('/')) #The alias record.
view, args, kwargs = resolve_to_func(obj.dst + '/') #Modified http://djangosnippets.org/snippets/2262/
request.path = request.path.replace(request.path_info, obj.dst)
request.path_info = obj.dst
request.META['PATH_INFO'] = obj.dst
request.META['ROUTED_FROM'] = obj.src
request.is_routed = True
return view(request, *args, **kwargs)
except A.DoesNotExist: #No alias for this path
request.is_routed = False
except TypeError: #View does not exist.
pass
return None
But, considering the objections against modifying the requests' attributes, wouldn't it be a better solution to just skip that part, and only add the is_routed and ROUTED_TO (instead of routed from) parts?
Code that relies on the original path could then use that key from META.
Doing this using URLConfs is not possible, because this aliasing is aimed at enabling the end-user to configure his own URLs, with the assumption that the end-user has no access to the codebase or does not know how to write his own URLConf.
Though it would be possible to write a function that converts a user-readable-editable file (XML for example) to valid Django urls, it feels that using database records allows a more dynamic generation of aliases (other objects defining their own aliases).
Sorry to necro-post, but I just found this thread while searching for answers. My solution seems simpler. Maybe a) I'm depending on newer django features or b) I'm missing a pitfall.
I encountered this because there is a bot named "Mediapartners-Google" which is asking for pages with url parameters still encoded as from a naive scrape (or double-encoded depending on how you look at it.) i.e. I have 404s in my log from it that look like:
1.2.3.4 - - [12/Nov/2012:21:23:11 -0800] "GET /article/my-slug-name%3Fpage%3D2 HTTP/1.1" 1209 404 "-" "Mediapartners-Google
Normally I'd just ignore a broken bot, but this one I want to appease because it ought to better target our ads (It's google adsense's bot) resulting in better revenue - if it can see our content. Rumor is it doesn't follow redirects so I wanted to find a solution similar to the original Q. I do not want regular clients accessing pages by these broken urls, so I detect the user-agent. Other applications probably won't do that.
I agree a redirect would normally be the right answer.
My (complete?) solution:
from django.http import QueryDict
from django.core.urlresolvers import NoReverseMatch, resolve
class MediapartnersPatch(object):
def process_request(self, request):
# short-circuit asap
if request.META['HTTP_USER_AGENT'] != 'Mediapartners-Google':
return None
idx = request.path.find('?')
if idx == -1:
return None
oldpath = request.path
newpath = oldpath[0:idx]
try:
url = resolve(newpath)
except NoReverseMatch:
return None
request.path = newpath
request.GET = QueryDict(oldpath[idx+1:])
response = url.func(request, *url.args, **url.kwargs)
response['Link'] = '<%s>; rel="canonical"' % (oldpath,)
return response