Django most efficient way to do this? - django

I have developed a few Django apps, all pretty straight-forward in terms of how I am interacting with the models.
I am building one now that has several different views which, for lack of a better term, are "canned" search result pages. These pages all return results from the same model, but they are filtered on different columns. One page we might be filtering on type, another we might be filtering on type and size, and on yet another we may be filtering on size only, etc...
I have written a function in views.py which is used by each of these pages, it takes a kwargs and in that are the criteria upon which to search. The minimum is one filter but one of the views has up to 4.
I am simply seeing if the kwargs dict contains one of the filter types, if so I filter the result on that value (I just wrote this code now, I apologize if any errors, but you should get the point):
def get_search_object(**kwargs):
q = Entry.objects.all()
if kwargs.__contains__('the_key1'):
q = q.filter(column1=kwargs['the_key1'])
if kwargs.__contains__('the_key2'):
q = q.filter(column2=kwargs['the_key2'])
return q.distinct()
Now, according to the django docs (http://docs.djangoproject.com/en/dev/topics/db/queries/#id3), these is fine, in that the DB will not be hit until the set is evaluated, lately though I have heard that this is not the most efficient way to do it and one should probably use Q objects instead.
I guess I am looking for an answer from other developers out there. My way currently works fine, if my way is totally wrong from a resources POV, then I will change ASAP.
Thanks in advance

Resource-wise, you're fine, but there are a lot of ways it can be stylistically improved to avoid using the double-underscore methods and to make it more flexible and easier to maintain.
If the kwargs being used are the actual column names then you should be able to pretty easily simplify it since what you're kind of doing is deconstructing the kwargs and rebuilding it manually but for only specific keywords.
def get_search_object(**kwargs):
entries = Entry.objects.filter(**kwargs)
return entries.distinct()
The main difference there is that it doesn't enforce that the keys be actual columns and pretty badly needs some exception handling in there. If you want to restrict it to a specific set of fields, you can specify that list and then build up a dict with the valid entries.
def get_search_object(**kwargs):
valid_fields = ['the_key1', 'the_key2']
filter_dict = {}
for key in kwargs:
if key in valid_fields:
filter_dict[key] = kwargs[key]
entries = Entry.objects.filter(**filter_dict)
return entries.distinct()
If you want a fancier solution that just checks that it's a valid field on that model, you can (ab)use _meta:
def get_search_object(**kwargs):
valid_fields = [field.name for field in Entry._meta.fields]
filter_dict = {}
for key in kwargs:
if key in valid_fields:
filter_dict[key] = kwargs[key]
entries = Entry.objects.filter(**filter_dict)
return entries.distinct()

In this case, your usage is fine from an efficiency standpoint. You would only need to use Q objects if you needed to OR your filters instead of AND.

Related

Ordering dictionary in context Django

I looked at this thread, with an example of sorting dictionaries.
I have a dictionary of programme objects where the key is a programme object and the value is a lookup of the number of related Project objects.
def DepartmentDetail(request, pk):
department = Department.objects.get(pk=pk)
programmes = Programme.objects.all().filter(department=department).exclude(active=False).order_by('long_name')
combi = {}
for p in programmes:
prj = Project.objects.all().filter(programme=p)
combi[p] = str(len(prj))
return render(request, 'sysadmin/department.html',{'department': department, 'programmes': programmes, 'combi': sorted(combi.items())})
In the model, Programme returns a string 'long_name', so I believe that I am trying to sort a string key and a string value.
In the template I get to the keys and values as so,
{% for programme, n in combi %}
This gives me the error..
unorderable types: Programme() < Programme()
I don't really understand the error, in the python 3 documentation it states that the sorted() method accepts any iterable - So why does this happen?
I'm looking at collections.OrderedDict to solve the problem, but I want to know why this doesn't work.
Thanx.
Databases with index on columns are really good at sorting. There's almost never a need to sort on the client side. You can almost always do it in the server. The funny part it you apparently know how to do it too.
....exclude(active=False).order_by('long_name') # <--- this
Guess what, your data is already sorted there isn't a need to sort it again inside python!!
But there is a much bigger issue in your code. You are fetching a set of Project items and then looping through that set to retrieve them all over again one by one. So if you have 200 Project items you are doing 200 queries when one query does the job just as well. Just add select_related or prefetch_related depending on which direction you have the relationship.
Your code ideally should be something like this
department = Department.objects.get(pk=pk)
programmes = Programme.objects.all().filter(department=department).exclude(active=False).order_by('long_name')
return render(request, 'sysadmin/department.html',{'department': department, 'programmes': programmes,})
As far as I can see combi just contains duplicated data. The same thing can be accessed from programmes eg. programme.project_set.all()
(again this depends on which direction you have the relationship, your models are not shown)
Recommended reading: https://docs.djangoproject.com/en/1.10/ref/models/fields/#django.db.models.ForeignKey
The issue is that sorted expects a way to be able to order the items and by default there isn't any way to know how to order your objects. You can provide a key
sorted(combi.items(), key=lambda i: i.long_name)

Django: Add arbitrary additional data to a queryset

I am trying to display a map of my data based on a search. The easiest way to handle the map display would be to serialized the queryset generated by the search, and indeed this works just fine using . However, I'd really like to allow for multiple searches, with the displayed points being shown in a user chosen color. The user chosen color, obviously cannot come from the database, since it is not a property of these objects, so none of the aggregators make sense here.
I have tried simply making a utility class, since what I really need is a somewhat complex join between two model classes that then gets serialized into geojson. However, once I created that utility class, it became evident that I lost a lot of the benefits of having a queryset, especially the ability to easily serialize the data with django-geojson (or natively once I can get 1.8 to run smoothly).
Basically, I want to be able to do something like:
querySet = datumClass.objects.filter(...user submitted search parameters...).annotate(color='blue')
Is this possible at all? It seems like this would be more elegant and would work better than my current solution of a non-model utility class which has some serious serialization issues when I try to use python-geojson to serialize.
The problem is that extra comes with all sorts of warning about usefulness or deprecation... But this works:
.extra(select={'color': "'blue'"})
Notice the double quotes wrapping the string value.
This translates to:
SELECT ('blue') AS "color"
Not quite sure what you are trying to achieve, but you can add extra attributes to your objects iterating over the queryset in the view. These can be accessed from the template.
for object in queryset :
if object.contition = 'a'
object.color = 'blue'
else:
object.color = 'green'
if you have a dictionary that maps fields to values, you can do things like
filter_dictionary = {
'date__lte' : '2014-03-01'
}
qs = DatumClass.objects.filter(**filter_dictionary)
And qs would have all dates less than that date (if it has a date field). So, as a user, I could submit any key, value pairs that you could place in your dictionary.

Django views - optimum query set in a ForeignKey model

Having the model:
class Notebook(models.Model):
n_id = models.AutoField(primary_key = True)
class Note(models.Model):
b_nbook = models.ForeignKey(Notebook)
the URL pattern passing one parameter:
(r'^(?P<n_id>\d+)/$', 'notebook_notes')
and the following view:
def notebook_notes(request, n_id):
nbook = get_object_or_404(Nbook, pk=n_id)
...
which of the following is the optimum query set, and why? (they both work and pass the notes based to a selected by URL notebook)
notes = nbook.note_set.filter(b_nbook = n_id)
notes = Note.objects.select_related().filter(b_nbook = n_id)
Well you're comparing apples and oranges a bit there. They may return virtually the same, but you're doing different things on both.
Let's take the relational version first. That query is saying get all the notes that belong to nbook. You're then filtering that queryset by only notes that belong to nbook. You're filtering it twice on the same criteria, in effect. Since Django's querysets are lazy, it doesn't really do anything bad, like hit the database multiple times, but it's still unnecessary.
Now, the second version. Here, you're starting with all notes and filtering to just those that belong to the particular notebook. There's only one filter this time, but it's bad form to do it this way. Since it's a relation, you should look it up through the relational format, i.e. nbook.note_set.all(). On this version, though, you're also using select_related(), which wasn't used on the other version.
select_related will attempt to create a join table with any other relations on the model, in this case a Note. However, since the only relation on Note is Notebook and you already have the notebook, it's redundant.
Taking out all the redundancy in those two version leaves you with just:
notes = nbook.note_set.all()
That, too, will return exactly the same results as the other two version, but is much cleaner and standardized.

intelligent methodology for filtering server side

My lack of CS and inexperience is really coming to the forefront in this moment. I've never really handled filtering results server side. I'm thinking that this is not the right way to go about it. I'm using Django....
First, I assume that I can keep it DRYer by keeping this validation in my form definitions. Next, I was concerned about my chained filter statements. How important is it to use Q complex lookups as opposed to chaining filters at this point? I'm just building a prototype and I assume that I'll eventually have to go for a search solution more powerful than full text search.
My big issue right now (besides the length of the code and clearly the inefficiency) is that I'm not sure how to handle my rooms and workers inputs, which are select forms. If the user does not select a value, I want to remove these filters from the process server side. Should I just create two separate conditional series of lookups for these outcomes?
def search(request):
if request.method=='GET' and request.GET.get('region',''):
neighborhoods=request.GET.getlist('region')
min_rent=request.GET.get('min_cost','0')
min_rent=re.sub(r'[,]','',min_cost) #remove any ','
if re.search(r'[^\d]',min_cost):
min_cost=0
else:
min_cost=int(min_cost)
max_cost=request.GET.get('max_cost','0')
max_cost=re.sub(r'[,]','',max_cost) #remove any ','
if re.search(r'[^\d]',max_cost):
max_cost=100000
else:
max_cost=int(max_rent)
date_min=request.GET.get('from','')
date_max=request.GET.get('to','')
if not date_min:
date=(str(datetime.date.today()))
date_min=u'%s' %date
if not date_max:
date_max=u'2013-03-18'
rooms=request.GET.get('rooms',0)
if not rooms:
rooms=0
workers=request.GET.get('workers',0)
if not workers:
workers=0
#I should probably use Q objects here for complex lookups
posts=Post.objects.filter(region__in=region).filter(cost__gt=min_cost).filter(cost__lt=max_cost).filter(availability__gt=date_min).filter(availability__lt=date_max).filter(rooms=rooms).filter(workers=workers)
#return HttpResponse('%s' %posts)
return render_to_response("website/search.html",{'posts':posts),context_instance=RequestContext(request))
First, I assume that I can keep it
DRYer by keeping this validation in my
form definitions.
Yes, I'd put this in a form as it looks like you are using one to display the form anyways? Also, you can put a lot of your date formatting stuff right in the clean_FIELD methods to format the data in the cleaned_data dict. The only issue here is that output is actually modified so your users will see the change from 1,000 to 1000. Either way, I would put this logic in a form method.
# makes the view clean.
if form.is_valid():
form.get_posts(request)
return response
My big issue right now (besides the
length of the code and clearly the
inefficiency) is that I'm not sure how
to handle my rooms and workers inputs,
which are select forms. If the user
does not select a value, I want to
remove these filters from the process
server side. Should I just create two
separate conditional series of lookups
for these outcomes?
Q objects are only for complex lookups. I don't see a need for them here.
I also don't see why you need to chain the filters. I at first wondered if these are m2m, but these types of queries (__gt/__lt) don't behave any differently chaining as there is no overlap between the queries.
# this is more readable / concise.
# I'd combine as many of your queries as you can just for readability.
posts = Posts.objects.filter(
region__in=region,
cost__gte=min_cost,
# etc
)
Now, if you want optional arguments, my suggestion is to use a dictionary of keyword arguments so that you can dynamically populate the kwargs.
keyword_arguments = {
'region__in': region,
'cost__gte': min_cost,
'cost__lt': max_cost,
'availability__gt': date_min,
'availability__lt': date_max,
}
if request.GET.get('rooms'):
keyword_arguments['rooms'] = request.GET['rooms']
if request.GET.get('workers'):
keyword_arguments['workers'] = request.GET['workers']
posts = Posts.objects.filter(**keyword_arguments)

Django Newbie here... Is there a good way of handling empty MultipleChoiceField results (like a List-Comp format?)

I came across this blog entry which describes an elegant way of handling results returned from a ChoiceField to a view using a list comprehension technique - one that eliminates empty keys/values without all the intermediate data structures. This particular approach doesn't seem to work for MultipeChoiceFields, though. Is there a similar way one might approach those? (If, for example, the bedroom and bathrooms fields in the following example returned multiple values).
The code is as follows:
if search_form.is_valid():
searchdict = search_form.cleaned_data
# It's easier to store a dict of the possible lookups we want, where
# the values are the keyword arguments for the actual query.
qdict = { 'city': 'city__icontains',
'zip_code': 'zip_code',
'property_type': 'property_type_code',
'county': 'county__icontains',
'minimum_price': 'sale_price__gte',
'maximum_price': 'sale_price__lte',
'bedrooms': 'bedrooms__gte',
'bathrooms': 'baths_total__gte'}
# Then we can do this all in one step instead of needing to call
# 'filter' and deal with intermediate data structures.
q_objs = [Q(**{qdict[k]: searchdict[k]}) for k in qdict.keys() if searchdict.get(k, None)]
Thank you very much...this is an awesome community.
Hmm... tricky one, the problem is that you only want that specific case to work as multivalued so you can't use the "normal" approach of adding the filters.
Atleast... I'm assuming that if someone selects multiple bathrooms that it would be either instead of both.
This is too much for a one liner I think, but it can work ;)
import operator
create_qs = lambda k, vs: reduce(operator.or_, [Q(**{k: v}) for v in vs])
q_objs = [create_qs(k, searchdict.getlist(k)) for k in qdict.keys() if k in searchdict]
You might be looking for the IN field lookup like this:
'bathrooms': 'baths_total__in',