Difference between if and if exists()? - django

Take any given queryset, qs = QS.objects.filter(active=True)
Is there a difference between:
if qs:
and
if qs.exists():
regarding load on the db, etc?

Yes, there's a difference:
if qs will use the __bool__ method of the QuerySet object (__nonzero__ in Python 2), which calls _fetch_all, which in turn executes the full query and loads the results (that's how I interpret it anyway).
exists() does something more efficient, as noted by Ewan. That's why this method... exists.
So, in short, use exists() when you only need to check for existence since that's what it's for.

From the documentation of exists()
Returns True if the QuerySet contains any results, and False if not. This tries to perform the query in the simplest and fastest way possible, but it does execute nearly the same query as a normal QuerySet query.
exists() is useful for searches relating to both object membership in a QuerySet and to the existence of any objects in a QuerySet, particularly in the context of a large QuerySet.
However, they then go on to show a few examples and conclude that the queryset needs to be large before if qs.exists() gives a noticeable efficiency gain over if qs.
A final caveat from the documentation:
Additionally, if a some_queryset has not yet been evaluated, but you know that it will be at some point, then using some_queryset.exists() will do more overall work (one query for the existence check plus an extra one to later retrieve the results) than simply using bool(some_queryset), which retrieves the results and then checks if any were returned.
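To make that caveat concrete, here is a toy stand-in for a lazy queryset. FakeQuerySet is a hypothetical class that only mimics the caching behaviour described above; it is not Django's implementation, and its queries counter simply counts simulated database round trips.

```python
class FakeQuerySet:
    """Toy lazy result set; NOT Django code, just a model of its caching."""

    def __init__(self, rows):
        self._rows = rows      # pretend these rows live in the database
        self._cache = None     # filled on the first full evaluation
        self.queries = 0       # simulated database round trips

    def _fetch_all(self):
        if self._cache is None:
            self.queries += 1  # one full SELECT
            self._cache = list(self._rows)

    def __bool__(self):
        self._fetch_all()
        return bool(self._cache)

    def __iter__(self):
        self._fetch_all()
        return iter(self._cache)

    def exists(self):
        if self._cache is not None:
            return bool(self._cache)  # already evaluated: no new query
        self.queries += 1             # one cheap existence query
        return bool(self._rows)


# exists() first, then use the results: two round trips.
a = FakeQuerySet(["x", "y"])
if a.exists():
    results_a = list(a)

# Boolean test first, then use the results: one round trip.
b = FakeQuerySet(["x", "y"])
if b:
    results_b = list(b)

print(a.queries, b.queries)
```

If the results are never used afterwards, the exists() path stays at a single cheap query, which is the case it is meant for.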

It produces the same result. From the help on https://docs.djangoproject.com/en/dev/ref/models/querysets/
exists()
Returns True if the QuerySet contains any results, and False if not.
bool()
Testing a QuerySet in a boolean context, ..., will cause the query to be executed. If there is at least one result, the QuerySet is True, otherwise False.

Related

How to get boolean result in annotate django?

I have a filter which should return a queryset with 2 objects, and should have one different field. for example:
obj_1 = (name='John', age='23', is_fielder=True)
obj_2 = (name='John', age='23', is_fielder=False)
Both objects are of the same model, but have different primary keys. I tried using the filter below:
qs = Model.objects.filter(name='John', age='23').annotate(is_fielder=F('plays__outdoor_game_role')=='Fielder')
I used annotate first time, but it gave me the below error:
TypeError: QuerySet.annotate() received non-expression(s): False.
I am new to Django, so what am I doing wrong, and what should be the annotate to get the required objects as shown above?
The solution by @ktowen works well and is quite straightforward.
Here is another solution I am using, hope it is helpful too.
from django.db.models import BooleanField, ExpressionWrapper, Q
queryset = queryset.annotate(is_fielder=ExpressionWrapper(
    Q(plays__outdoor_game_role='Fielder'),
    output_field=BooleanField(),
))
Here are some explanations for those who are not familiar with Django ORM:
annotate() creates a new column/field on the fly, in this case is_fielder. This means that even though your model has no field named is_fielder, you can read it as an attribute (for example obj.is_fielder) on every object the annotated queryset returns. annotate() is extremely useful and flexible, can be combined with almost every other expression, and is a must-know method in the Django ORM.
ExpressionWrapper basically gives you space to wrap a more complicated combination of conditions, in the form ExpressionWrapper(expression, output_field). It is useful when you are combining different types of fields or want to specify an output type that Django cannot infer automatically.
Q object is a frequently used expression to specify a condition, I think the most powerful part is that it is possible to chain the conditions:
AND (&): filter(Q(condition1) & Q(condition2))
OR (|): filter(Q(condition1) | Q(condition2))
NOT (~): filter(~Q(condition))
It is possible to combine Q objects with normal keyword conditions, as below (the conditions are ANDed together):
filter(Q(condition1), id__in=[list])
The point is that the Q objects must come first, before any keyword arguments, or it will not work.
Case When(then) can be simply explained as if con1 elif con2 elif con3 .... It is quite powerful and personally, I love to use this to customize an ordering object for a queryset.
For example, you need to return a queryset of watch history items, and those must be in an order of watching by the user. You can do it with for loop to keep the order but this will generate plenty of similar queries. A more elegant way with Case When would be:
from django.db.models import Case, When
item_ids = [list]  # primary keys, in the order the user watched them
ordering = Case(*[When(pk=pk, then=pos)
                  for pos, pk in enumerate(item_ids)])
watch_history = Item.objects.filter(id__in=item_ids)\
    .order_by(ordering)
As you can see, by using Case When(then) it is possible to bind those very concrete relations, which could be considered as 1) a pinpoint/precise condition expression and 2) especially useful in a sequential multiple conditions case.
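For readers who want to see what such an ordering boils down to in the database, here is a rough sketch using the standard-library sqlite3 module instead of the ORM. The item table, its contents and the ids in item_ids are made up for illustration, and the SQL is only an approximation of what a Case/When ordering compiles to, not Django's exact output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO item (id, title) VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c"), (4, "d")],
)

item_ids = [3, 1, 4]  # hypothetical watch order

# One "WHEN id = ? THEN <position>" branch per id, like When(pk=pk, then=pos).
whens = " ".join(f"WHEN id = ? THEN {pos}" for pos in range(len(item_ids)))
placeholders = ", ".join("?" for _ in item_ids)
sql = (
    f"SELECT id, title FROM item WHERE id IN ({placeholders}) "
    f"ORDER BY CASE {whens} END"
)

# Parameters are bound in textual order: first the IN list, then the WHEN branches.
rows = conn.execute(sql, item_ids + item_ids).fetchall()
print(rows)
```

The whole ordering is resolved in a single query, which is the advantage over fetching the items one by one in a loop.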
You can use Case/When with annotate:
from django.db.models import Case, BooleanField, Value, When
Model.objects.filter(name='John', age='23').annotate(
    is_fielder=Case(
        When(plays__outdoor_game_role='Fielder', then=Value(True)),
        default=Value(False),
        output_field=BooleanField(),
    ),
)

ObjectDoesNotExist vs. .filter().first() and check for None

In Django 1.6 they introduced .first() to get the first element of a queryset.
Now there are 2 ways to get a single element:
user_id = 42
try:
    obj = User.objects.get(id=user_id)
except ObjectDoesNotExist:
    raise Exception("Invalid user id given")
and:
user_id = 42
obj = User.objects.filter(id=user_id).first()
if not obj:
    raise Exception("Invalid user id given")
Following the Pythonic principle that it is easier to ask for forgiveness than permission, the first one would be the preferred way.
However, the second one might be easier to understand and it is one line shorter.
Q1: Is there any difference in speed between these two code snippets?
Q2: Which one is the preferred way to get a single object?
The two have different semantics and different guarantees. The main difference is how they handle multiple matching objects.
.get() will raise an exception if multiple objects match the given query. You should therefore use .get() to fetch an item based on a unique property (such as id) or set of properties.
.first() will return the first item, based on the defined ordering, if multiple objects match the given query. Use this to filter on non-unique properties, when you need a single item, the first one based on some (possibly undefined) ordering.
So while .get() guarantees that exactly one item matches the query, .first() only guarantees that it returns the first item based on the given ordering.
How they handle a missing object is more a case of semantics. It is trivial to convert an exception to None or the other way around. While you might save a single line here and there, I wouldn't base my decision to use one over the other on this. The performance difference is negligible as well, and probably depends on the results of the query.
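A small plain-Python sketch of the two contracts may help. The helpers get_one and first_or_none and the two exception classes are hypothetical stand-ins that work on a list of dicts; they only imitate the semantics described above and are not the ORM's code.

```python
class DoesNotExist(Exception):
    """No row matched (stand-in for the ORM's DoesNotExist)."""


class MultipleObjectsReturned(Exception):
    """More than one row matched."""


def _matching(rows, conditions):
    # Keep the rows whose fields equal every given condition.
    return [r for r in rows if all(r[k] == v for k, v in conditions.items())]


def get_one(rows, **conditions):
    """Like .get(): exactly one match, otherwise raise."""
    matches = _matching(rows, conditions)
    if not matches:
        raise DoesNotExist(conditions)
    if len(matches) > 1:
        raise MultipleObjectsReturned(conditions)
    return matches[0]


def first_or_none(rows, **conditions):
    """Like .filter(...).first(): first match by id, or None."""
    matches = sorted(_matching(rows, conditions), key=lambda r: r["id"])
    return matches[0] if matches else None


users = [
    {"id": 1, "name": "John"},
    {"id": 2, "name": "John"},
    {"id": 3, "name": "Ann"},
]

print(get_one(users, id=3))
print(first_or_none(users, name="John"))
print(first_or_none(users, name="Zed"))
```

Calling get_one(users, name="John") raises MultipleObjectsReturned here, while first_or_none quietly picks the row with the lowest id, which is exactly the difference in guarantees.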

QuerySet subscripting won't work as expected

When you get a QuerySet by using filter, it's tempting to use the following code to change and save an object:
qs = SomeModel.objects.filter(owner_id=123)
# suppose qs has 1 or many elements
last_login_time = qs[0].last_login_time
qs[0].last_login_time = datetime.now() # I expect it can assign the new value, but it won't
assertEquals(qs[0].last_login_time, last_login_time) # YES, it doesn't change
qs[0].save() #So it won't update the old record
And after figuring this out, the following code will be used instead and it works:
qs = SomeModel.objects.filter(owner_id=123)
# suppose qs has 1 or many elements
obj = qs[0]
last_login_time = obj.last_login_time
obj.last_login_time = datetime.now() # I expect it to assign the new value, and it does
assertNotEquals(obj.last_login_time, last_login_time) # YES, it does change
obj.save() #So it will update the old record as expected
I have seen some of my friends/colleagues use the first approach to update records. IMO it looks natural and is easy to reach for (qs[0] and obj have the same type when you inspect them).
After reading the code (db.models.query), the reason becomes clear: when you subscript the QuerySet it works on qs = self._clone(), so each qs[0] returns a new object and the assigned value is lost.
Possible solutions:
make the assignment work for the subscripted QuerySet
announce that the first approach above is wrong and let the users know it
So I want to ask:
Is this a real issue for Django? (I'm wondering why the Django developers did not make it work as expected.)
What's your suggestion about this issue? And what's your preferred way to handle it?
Use update for updating fields in a queryset.
I'm not really sure what you're asking here. Are you saying this is a bug? I don't think so, it's clearly defined behaviour: the queryset is lazy, but is evaluated when you iterate or slice it. Each time you do slice it, you get a new object. This is the logical consequence of the fact that slicing by itself doesn't cause the non-sliced queryset to be evaluated - if the result isn't already cached, slicing will perform a single database call with a LIMIT 1 to only get one result. Otherwise, you're left with extremely undesirable side-effects.
Now, if you think this could be better explained in the docs, you're welcome - and encouraged - to submit a bug with a patch that explains it better.
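The behaviour can be reproduced without a database. In this sketch UncachedQuerySet and Row are hypothetical toy classes: every subscript builds a fresh object from the stored data, the way an unevaluated queryset runs a new query for each qs[0]. It models only that uncached case and is not how Django is implemented.

```python
import copy


class Row:
    """Minimal record object with arbitrary attributes."""

    def __init__(self, **fields):
        self.__dict__.update(fields)


class UncachedQuerySet:
    """Toy queryset: each subscript simulates a fresh database fetch."""

    def __init__(self, stored_rows):
        self._stored_rows = stored_rows  # stands in for the table contents

    def __getitem__(self, index):
        # A new Python object every time, like a new "LIMIT 1" query.
        return copy.copy(self._stored_rows[index])


qs = UncachedQuerySet([Row(owner_id=123, last_login_time="old")])

# First approach: the assignment lands on a temporary object and is lost.
qs[0].last_login_time = "new"
lost = qs[0].last_login_time

# Second approach: keep one reference and work on that object.
obj = qs[0]
obj.last_login_time = "new"
kept = obj.last_login_time

print(lost, kept)
```

Holding on to obj is what makes the later obj.save() write the changed value.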

Checking for empty queryset in Django

What is the recommended idiom for checking whether a query returned any results?
Example:
orgs = Organisation.objects.filter(name__iexact = 'Fjuk inc')
# If any results
# Do this with the results without querying again.
# Else, do something else...
I suppose there are several different ways of checking this, but I'd like to know how an experienced Django user would do it.
Most examples in the docs just ignore the case where nothing was found...
if not orgs:
    # The Queryset is empty ...
else:
    # The Queryset has results ...
Since version 1.2, Django has had the QuerySet.exists() method, which is the most efficient:
if orgs.exists():
    # Do this...
else:
    # Do that...
But if you are going to evaluate the QuerySet anyway, it's better to use:
if orgs:
    ...
For more information read QuerySet.exists() documentation.
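To see why the existence check is cheaper, here is a sketch with the standard-library sqlite3 module. The organisation table and the three helper functions are made up for illustration, and the SQL only approximates the kind of statements an ORM would issue for an existence check, a full fetch and a count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE organisation (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO organisation (name) VALUES (?)",
    [("Fjuk inc",)] * 3 + [("Other",)],
)


def exists(name):
    # Stops at the first matching row and transfers no real data.
    row = conn.execute(
        "SELECT 1 FROM organisation WHERE name = ? LIMIT 1", (name,)
    ).fetchone()
    return row is not None


def fetch_all(name):
    # What a boolean test on an unevaluated queryset amounts to: load everything.
    return conn.execute(
        "SELECT id, name FROM organisation WHERE name = ?", (name,)
    ).fetchall()


def count(name):
    # Has to visit every matching row, but returns a single number.
    return conn.execute(
        "SELECT COUNT(*) FROM organisation WHERE name = ?", (name,)
    ).fetchone()[0]


print(exists("Fjuk inc"), exists("Nope"))
print(len(fetch_all("Fjuk inc")), count("Fjuk inc"))
```

If the rows returned by fetch_all are needed anyway, the single full query does both jobs, which is the reason for preferring if orgs: in that situation.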
To check the emptiness of a queryset:
if orgs.exists():
    # Do something
or you can check for the first item in a queryset; if it doesn't exist, first() will return None:
if orgs.first():
    # Do something
If you have a huge number of objects, this can (at times) be much faster:
try:
    orgs[0]
    # If you get here, it exists...
except IndexError:
    # Doesn't exist!
On a project I'm working on with a huge database, not orgs is 400+ ms and orgs.count() is 250ms. In my most common use cases (those where there are results), this technique often gets that down to under 20ms. (One case I found, it was 6.)
Could be much longer, of course, depending on how far the database has to look to find a result. Or even faster, if it finds one quickly; YMMV.
EDIT: This will often be slower than orgs.count() if the result isn't found, particularly if the condition you're filtering on is a rare one; as a result, it's particularly useful in view functions where you need to make sure the view exists or throw Http404. (Where, one would hope, people are asking for URLs that exist more often than not.)
The most efficient way (before django 1.2) is this:
if orgs.count() == 0:
    # no results
else:
    # alright! let's continue...
I disagree with the predicate
if not orgs:
It should be
if not orgs.count():
I was having the same issue with a fairly large result set (~150k results). Testing the QuerySet in a boolean context evaluates it, so the whole result set is actually fetched and unpacked as a list before the check is made. In my case execution time went down by three orders of magnitude.
You could also use this:
if(not(orgs)):
    #if orgs is empty
else:
    #if orgs is not empty

django - reorder queryset after slicing it

I fetch the latest 5 rows from a Foo model which is ordered by a datetime field.
qs = Foo.objects.all()[:5]
In the following step, I want to reorder the queryset by some other criteria (actually, by the same datetime field in the opposite direction). But reordering after a slice is not permitted. reverse() undoes the first ordering, giving me a different queryset. Is there a way to accomplish what I want without creating a list from the queryset and doing the ordering using it?
order_by gives you SQL in-database ordering. You're already using that, and then slicing on it. At that point, the results are retrieved into memory. If you want to change their order, you need to use Python in-memory sorting to do it, not the ORM in-database sorting.
In your case, Daniel has already given the best solution: since you simply want to sort by the same field, but in the other order, just reverse the list you have:
qs = Foo.objects.all()[:5]
objs = reversed(qs)
If you had wanted to sort by some other field, then you'd use the sorted() function with a custom key function:
qs = Foo.objects.all()[:5]
objs = sorted(qs, key=lambda o: o.some_other_field)
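Both in-memory options can be tried without a database. In this sketch the Foo dataclass and its sample rows are hypothetical, and latest is a plain list standing in for the already sliced queryset.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Foo:
    name: str
    created: datetime


rows = [
    Foo("delta", datetime(2024, 1, 1)),
    Foo("alpha", datetime(2024, 1, 2)),
    Foo("charlie", datetime(2024, 1, 3)),
    Foo("bravo", datetime(2024, 1, 4)),
]

# Stand-in for the sliced queryset: the latest three rows, newest first.
latest = sorted(rows, key=lambda f: f.created, reverse=True)[:3]

# Same field, opposite direction: just reverse what was fetched.
oldest_first = list(reversed(latest))

# Different field: sort in memory with a key function.
by_name = sorted(latest, key=lambda f: f.name)

print([f.name for f in oldest_first])
print([f.name for f in by_name])
```

Neither step touches the database again; both only rearrange the rows that the slice already selected.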
No, there's no way of doing that. order_by is an operation on the database, and once a queryset has been sliced Django will not let you reorder it, so the new ordering cannot be sent back to the database after that.
Sounds like you already know the solution, though: run reversed() on the evaluated qs.
qs = reversed(Foo.objects.all()[:5])
Late answer, but this just worked for me:
import random
sorted(queryset[:10], key=lambda x: random.random())