Django custom prefetch and conditional annotate

I am using a custom Prefetch object to fetch only some of the related objects, e.g.:
unreleased_prefetch = Prefetch("chants", Chant.objects.with_audio())
teams = Team.objects.public().prefetch_related(unreleased_prefetch)
This works well, but I also want to know the count of these objects and filter by it. I am glad that I can currently pass a queryset as a parameter to the Prefetch object (I rely heavily on custom QuerySets/Managers).
Is there a way to reuse the query that I pass to the Prefetch object in a conditional annotate as well?
So far my conditional annotate is quite ugly and looks like this (it does the same thing as my original Chant with_audio custom query/filter):
.annotate(
    unreleased_count=Count(Case(
        When(chants__has_audio_versions=True, chants__has_audio=True,
             chants__flag_reject=False, chants__active=False, then=1),
        output_field=IntegerField()))
).filter(unreleased_count__gt=0)
It works, but it is quite ugly and duplicates logic.
Is there a way to pass a queryset to When the same way I can pass it to Prefetch, to avoid the duplication?

Not saying this is the best practice or anything, but wanted to provide a potential way of dealing with such a situation.
Let's say you have a ChantQuerySet class:
class ChantQuerySet(models.QuerySet):
    def with_audio(self):
        return self.filter(chants__has_audio_versions=True, chants__has_audio=True,
                           chants__flag_reject=False, chants__active=False)
Which you use as a manager doing something like below, probably:
class Chant(models.Model):
    # ...
    objects = ChantQuerySet.as_manager()
I would suggest storing the filter in the QuerySet:
from django.db.models import Q

class ChantQuerySet(models.QuerySet):
    @property
    def with_audio_filter(self):
        return Q(chants__has_audio_versions=True, chants__has_audio=True,
                 chants__flag_reject=False, chants__active=False)

    def with_audio(self):
        return self.filter(self.with_audio_filter)
This gives you the ability to do this:
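# with_audio_filter is a property, so read it from a queryset instance, not the class.
# The chants__ prefix means the annotation runs from the Team side of the relation.
Team.objects.annotate(
    unreleased_count=Count(Case(
        When(Chant.objects.all().with_audio_filter, then=1),
        output_field=IntegerField()))
).filter(unreleased_count__gt=0)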
Now you are able to change the filter only in one place, should you need to do so, without having to change it everywhere. To me it makes sense to store this filter in the QuerySet and personally I see nothing wrong with that, but that's just me.
One thing that I'd change though, is to either make the with_audio_filter property cached, or store it in a field in the constructor when initializing ChantQuerySet.
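One wrinkle with the approach above is that the same conditions need a chants__ prefix when used from Team but no prefix when filtering Chant directly. A minimal sketch of one way to handle that, assuming a hypothetical helper with_audio_conditions and a hypothetical prefix parameter (neither is in the original answer), is to keep the lookups in a plain dict and prefix them on demand:

```python
def with_audio_conditions(prefix=""):
    """Return the 'with audio' lookups as a dict, optionally prefixed.

    `prefix` is a hypothetical parameter: pass "chants__" when the
    conditions are applied from the Team side of the relation.
    """
    lookups = {
        "has_audio_versions": True,
        "has_audio": True,
        "flag_reject": False,
        "active": False,
    }
    return {f"{prefix}{name}": value for name, value in lookups.items()}


# Intended use with Django (not executed here):
#   Chant.objects.filter(**with_audio_conditions())
#   When(**with_audio_conditions("chants__"), then=1)
team_side = with_audio_conditions("chants__")
```

The field names are taken from the question; whether they live directly on Chant is an assumption based on the question's When(...) call.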

Related

How to get boolean result in annotate django?

I have a filter which should return a queryset with 2 objects, and should have one different field. for example:
obj_1 = (name='John', age='23', is_fielder=True)
obj_2 = (name='John', age='23', is_fielder=False)
Both objects are of the same model, but have different primary keys. I tried using the filter below:
qs = Model.objects.filter(name='John', age='23').annotate(is_fielder=F('plays__outdoor_game_role')=='Fielder')
I used annotate first time, but it gave me the below error:
TypeError: QuerySet.annotate() received non-expression(s): False.
I am new to Django, so what am I doing wrong, and what should be the annotate to get the required objects as shown above?

The solution by @ktowen works well and is quite straightforward.
Here is another solution I am using, hope it is helpful too.
queryset = queryset.annotate(
    is_fielder=ExpressionWrapper(
        Q(plays__outdoor_game_role='Fielder'),
        output_field=BooleanField(),
    ),
)
Here are some explanations for those who are not familiar with Django ORM:
annotate() creates a new column/field on the fly, in this case is_fielder. Your model does not need a field named is_fielder; after adding the annotation you can read it on each returned object as obj.is_fielder. annotate() is extremely useful and flexible, can be combined with almost every other expression, and is a must-know method in the Django ORM.
ExpressionWrapper basically gives you room to wrap a more complicated combination of conditions, in the form ExpressionWrapper(expression, output_field=...). It is useful when you are combining different types of fields or want to specify an output type that Django cannot infer automatically.
Q object is a frequently used expression to specify a condition, I think the most powerful part is that it is possible to chain the conditions:
AND (&): filter(Q(condition1) & Q(condition2))
OR (|): filter(Q(condition1) | Q(condition2))
Negative(~): filter(~Q(condition))
It is possible to use Q together with ordinary keyword lookups, like below:
filter(Q(condition1) | Q(condition2), id__in=[list])
The point is that the Q objects must come first, as positional arguments before any keyword arguments, or it will not work.
Case When(then) can be simply explained as if con1 elif con2 elif con3 .... It is quite powerful and personally, I love to use this to customize an ordering object for a queryset.
For example, you need to return a queryset of watch history items, and those must be in an order of watching by the user. You can do it with for loop to keep the order but this will generate plenty of similar queries. A more elegant way with Case When would be:
item_ids = [list]
ordering = Case(*[When(pk=pk, then=pos)
                  for pos, pk in enumerate(item_ids)])
watch_history = Item.objects.filter(id__in=item_ids)\
                            .order_by(ordering)
As you can see, by using Case When(then) it is possible to bind those very concrete relations, which could be considered as 1) a pinpoint/precise condition expression and 2) especially useful in a sequential multiple conditions case.
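To see what such an ordering expression amounts to at the SQL level, here is a rough sketch using the standard-library sqlite3 module instead of Django. The item table, its rows, and the watch order are made up for illustration, and the SQL is hand-written rather than the exact statement Django would generate:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO item (id, title) VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c"), (4, "d")],
)

item_ids = [3, 1, 4]  # hypothetical watch order

# One "WHEN <pk> THEN <position>" branch per id, mirroring When(pk=pk, then=pos).
placeholders = ", ".join("?" for _ in item_ids)
whens = " ".join("WHEN ? THEN ?" for _ in item_ids)
sql = (
    f"SELECT id FROM item WHERE id IN ({placeholders}) "
    f"ORDER BY CASE id {whens} END"
)

# Parameters follow the order of the placeholders in the SQL text.
params = list(item_ids)
for pos, pk in enumerate(item_ids):
    params += [pk, pos]

ordered_ids = [row[0] for row in conn.execute(sql, params)]
print(ordered_ids)
```

The rows come back in the order given by item_ids, in a single query, which is the property the Case/When ordering relies on.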

You can use Case/When with annotate
from django.db.models import Case, BooleanField, Value, When

Model.objects.filter(name='John', age='23').annotate(
    is_fielder=Case(
        When(plays__outdoor_game_role='Fielder', then=Value(True)),
        default=Value(False),
        output_field=BooleanField(),
    ),
)
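For readers who want to see the idea behind this annotation without a Django project, the following sketch expresses the same conditional column as a SQL CASE expression with the standard-library sqlite3 module. The player table and its role column are simplifications invented for this example (in the question the role lives on a related plays relation):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE player (id INTEGER PRIMARY KEY, name TEXT, role TEXT)"
)
conn.executemany(
    "INSERT INTO player (name, role) VALUES (?, ?)",
    [("John", "Fielder"), ("John", "Bowler")],
)

# CASE yields 1 when the condition matches and 0 otherwise,
# playing the part of When(..., then=Value(True)) with default=Value(False).
rows = conn.execute(
    "SELECT name, CASE WHEN role = 'Fielder' THEN 1 ELSE 0 END AS is_fielder "
    "FROM player WHERE name = 'John' ORDER BY id"
).fetchall()

flags = [bool(is_fielder) for _, is_fielder in rows]
print(flags)
```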

Why .filter() in django returns duplicated objects?

I've followed django tutorial and arrived at tutorial05.
I tried to not show empty poll as tutorial says, so I added filter condition like this:
class IndexView(generic.ListView):
    ...
    def get_queryset(self):
        return Question.objects.filter(
            pub_date__lte=timezone.now(),
            choice__isnull=False
        ).order_by('-pub_date')[:5]
But this returned two objects which are exactly same.
I think choice__isnull=False caused the problem, but not sure.

choice__isnull causes the problem. It leads to join with choice table (to weed out questions without choices), that is something like this:
SELECT question.*
FROM question
JOIN choice
ON question.id = choice.question_id
WHERE question.pub_date < NOW()
You can inspect query attribute of QuerySet to be sure. So if you have one question with two choices, you will get that question two times. You need to use distinct() method in this case: queryset.distinct().
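The duplication can be reproduced outside Django. The sketch below uses the standard-library sqlite3 module with two small tables whose names and rows are made up to resemble the tutorial's models; it runs the join once as-is and once with DISTINCT:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE question (id INTEGER PRIMARY KEY, text TEXT);
    CREATE TABLE choice (id INTEGER PRIMARY KEY, question_id INTEGER, text TEXT);
    INSERT INTO question (id, text) VALUES (1, 'With choices'), (2, 'No choices');
    INSERT INTO choice (question_id, text) VALUES (1, 'yes'), (1, 'no');
""")

join_sql = (
    "SELECT question.id FROM question "
    "JOIN choice ON question.id = choice.question_id"
)

# One output row per matching choice, so question 1 appears twice.
joined = [row[0] for row in conn.execute(join_sql)]

# DISTINCT collapses the repeated rows.
distinct_sql = join_sql.replace("SELECT", "SELECT DISTINCT", 1)
distinct = [row[0] for row in conn.execute(distinct_sql)]

print(joined, distinct)
```

Question 2 has no choices, so the inner join drops it in both cases; only the repeated rows for question 1 differ.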

Just use .distinct() at the end of your ORM query.

A little late to the party, but I figured it could help others looking up the same issue.
Instead of using choice__isnull=False with the filter() method, use it with exclude() instead to exclude out any questions without any choices. So your code would look something like this:
...
def get_queryset(self):
    return (Question.objects
            .filter(pub_date__lte=timezone.now())
            .exclude(choice__isnull=True)
            .order_by('-pub_date')[:5])
By doing it this way, it will return only one instance of the question. Be sure to use choice__isnull=True though.

Because you created two objects with same properties. If you want to ensure uniqueness, you should add validation in clean and add unique index on identifier field too.
Besides filter returns all the objects that match the criteria, if you are expecting only one item to be returned, you should use get instead. get would raise exception if less or more than 1 item is found.

Django: Add arbitrary additional data to a queryset

I am trying to display a map of my data based on a search. The easiest way to handle the map display would be to serialize the queryset generated by the search, and indeed this works just fine using . However, I'd really like to allow for multiple searches, with the displayed points being shown in a user-chosen color. The user-chosen color obviously cannot come from the database, since it is not a property of these objects, so none of the aggregators make sense here.
I have tried simply making a utility class, since what I really need is a somewhat complex join between two model classes that then gets serialized into geojson. However, once I created that utility class, it became evident that I lost a lot of the benefits of having a queryset, especially the ability to easily serialize the data with django-geojson (or natively once I can get 1.8 to run smoothly).
Basically, I want to be able to do something like:
querySet = datumClass.objects.filter(...user submitted search parameters...).annotate(color='blue')
Is this possible at all? It seems like this would be more elegant and would work better than my current solution of a non-model utility class which has some serious serialization issues when I try to use python-geojson to serialize.

The problem is that extra comes with all sorts of warning about usefulness or deprecation... But this works:
.extra(select={'color': "'blue'"})
Notice the double quotes wrapping the string value.
This translates to:
SELECT ('blue') AS "color"

Not quite sure what you are trying to achieve, but you can add extra attributes to your objects iterating over the queryset in the view. These can be accessed from the template.
for obj in queryset:
    if obj.condition == 'a':
        obj.color = 'blue'
    else:
        obj.color = 'green'

If you have a dictionary that maps fields to values, you can do things like
filter_dictionary = {
    'date__lte': '2014-03-01'
}
qs = DatumClass.objects.filter(**filter_dictionary)
And qs would have all dates less than that date (if it has a date field). So, as a user, I could submit any key, value pairs that you could place in your dictionary.

Appending to QuerySet used as a __in lookup parameter

I'd like to retrieve all posts authored by a user OR by his/her friends:
Currently, I use Q objects:
# friends = QuerySet containing User objects
# user = User object
posts = Post.objects.filter(Q(author__in=friends) | Q(author=user))
Which generates a SQL query that looks like this:
SELECT ... WHERE ("posts_post"."author_id" IN (2, 3) OR "posts_post"."author_id" = 1)
Question:
Is it possible to append the user object to the friends QuerySet to generate a query that looks like this, instead:
SELECT ... WHERE "posts_post"."author_id" IN (2, 3, 1)
Directly using the QuerySet's .append() method does in fact work:
friends.append(user)
posts = Post.objects.filter(author__in=friends)
However, I've seen a number of answers here and elsewhere cautioning against treating QuerySets as basic lists.
Is this latter .append() technique safe? Is it efficient, particularly if the QuerySet is fairly large? Or is there another preferred method? Alternatively, feel free to tell me that I'm being silly and that there's nothing wrong with the Q objects approach!
Many thanks.

Querysets don't have an append method, so if that's working in your context friends is already a list rather than a queryset.
As for performance - my gut feeling is that a queryset that's too large to pull into memory so you can append to it is also going to be too large to do well as a subquery. But as always with performance questions, testing is the only real way to be sure.
You'll definitely take something of a performance hit either way. If you don't need friends for anything else, you could use a values_list queryset to get just the PKs into memory, append user.id, then filter on that list of PKs.

You can use this trick:
# list() evaluates the queryset; a queryset itself has no append()
friend_ids = list(Friend.objects.values_list('id', flat=True))
friend_ids.append(user.pk)
posts = Post.objects.filter(author__in=friend_ids)
As long as you keep only the ids in memory, not the whole queryset, this method is pretty safe.

Django most efficient way to do this?

I have developed a few Django apps, all pretty straight-forward in terms of how I am interacting with the models.
I am building one now that has several different views which, for lack of a better term, are "canned" search result pages. These pages all return results from the same model, but they are filtered on different columns. One page we might be filtering on type, another we might be filtering on type and size, and on yet another we may be filtering on size only, etc...
I have written a function in views.py which is used by each of these pages, it takes a kwargs and in that are the criteria upon which to search. The minimum is one filter but one of the views has up to 4.
I am simply seeing if the kwargs dict contains one of the filter types, if so I filter the result on that value (I just wrote this code now, I apologize if any errors, but you should get the point):
def get_search_object(**kwargs):
    q = Entry.objects.all()
    if kwargs.__contains__('the_key1'):
        q = q.filter(column1=kwargs['the_key1'])
    if kwargs.__contains__('the_key2'):
        q = q.filter(column2=kwargs['the_key2'])
    return q.distinct()
Now, according to the Django docs (http://docs.djangoproject.com/en/dev/topics/db/queries/#id3), this is fine, in that the DB will not be hit until the set is evaluated. Lately, though, I have heard that this is not the most efficient way to do it and that one should probably use Q objects instead.
I guess I am looking for an answer from other developers out there. My way currently works fine, if my way is totally wrong from a resources POV, then I will change ASAP.
Thanks in advance

Resource-wise, you're fine, but there are a lot of ways it can be stylistically improved to avoid using the double-underscore methods and to make it more flexible and easier to maintain.
If the kwargs being used are the actual column names then you should be able to pretty easily simplify it since what you're kind of doing is deconstructing the kwargs and rebuilding it manually but for only specific keywords.
def get_search_object(**kwargs):
    entries = Entry.objects.filter(**kwargs)
    return entries.distinct()
The main difference there is that it doesn't enforce that the keys be actual columns and pretty badly needs some exception handling in there. If you want to restrict it to a specific set of fields, you can specify that list and then build up a dict with the valid entries.
def get_search_object(**kwargs):
    valid_fields = ['the_key1', 'the_key2']
    filter_dict = {}
    for key in kwargs:
        if key in valid_fields:
            filter_dict[key] = kwargs[key]
    entries = Entry.objects.filter(**filter_dict)
    return entries.distinct()
If you want a fancier solution that just checks that it's a valid field on that model, you can (ab)use _meta:
def get_search_object(**kwargs):
    valid_fields = [field.name for field in Entry._meta.fields]
    filter_dict = {}
    for key in kwargs:
        if key in valid_fields:
            filter_dict[key] = kwargs[key]
    entries = Entry.objects.filter(**filter_dict)
    return entries.distinct()
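The whitelisting step in the two functions above does not depend on Django, so it can be pulled out and tried on its own. A minimal sketch, assuming a hypothetical helper named build_filter_dict (not part of the original answer):

```python
def build_filter_dict(valid_fields, **kwargs):
    """Keep only the keyword arguments whose key is an allowed field name."""
    allowed = set(valid_fields)
    return {key: value for key, value in kwargs.items() if key in allowed}


filter_dict = build_filter_dict(
    ["the_key1", "the_key2"],
    the_key1="large",
    unknown="ignored",
)
# filter_dict could then be passed on as Entry.objects.filter(**filter_dict)
print(filter_dict)
```

Silently dropping unknown keys matches the behavior of the loops above; raising an error for them instead would be a reasonable alternative if bad input should be surfaced.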

In this case, your usage is fine from an efficiency standpoint. You would only need to use Q objects if you needed to OR your filters instead of AND.
