Django filter vs exclude - django

Is there a difference between filter and exclude in django? If I have
self.get_query_set().filter(modelField=x)
and I want to add another criterion, is there a meaningful difference between the following two lines of code?
self.get_query_set().filter(user__isnull=False, modelField=x)
self.get_query_set().filter(modelField=x).exclude(user__isnull=True)
Is one considered better practice, or are they the same in both function and performance?

Both are lazily evaluated, so I would expect them to perform equivalently. The SQL is likely different, but with no real distinction.

It depends on what you want to achieve. With boolean values it is easy to switch between .exclude() and .filter(), but what if, for example, you want to get all articles except those from March? You can write the query as
Posts.objects.exclude(date__month=3)
With .filter() it would be (but I am not sure whether this actually works):
Posts.objects.filter(date__month__in=[1,2,4,5,6,7,8,9,10,11,12])
or you would have to use a negated Q object, e.g. Posts.objects.filter(~Q(date__month=3)).
As the function name already suggests, .exclude() is used to exclude rows from the result set. For boolean values you can easily invert this and use .filter() instead, but for other values this can be trickier.

In general, exclude is the opposite of filter. In this case both examples work the same.
Here:
self.get_query_set().filter(user__isnull=False, modelField=x)
You select entries whose user field is not null and whose modelField has value x.
In this case:
self.get_query_set().filter(modelField=x).exclude(user__isnull=True)
First you select entries whose modelField has value x (both those where user is null and those where it is not), then you exclude the entries whose user field is null.
I think that in this case it would be better to use the first option, because it looks cleaner. But both work the same.
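To see why the two forms return the same rows here, the conditions can be tried directly against a small table. This is only a sketch: the table and column names are made up for illustration, and the two statements are hand-written approximations of what the ORM asks for, not the SQL Django actually generates.

```python
import sqlite3

# Hypothetical table standing in for the model; names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entry (id INTEGER PRIMARY KEY, model_field TEXT, user_id INTEGER)"
)
conn.executemany(
    "INSERT INTO entry (model_field, user_id) VALUES (?, ?)",
    [("x", 1), ("x", None), ("y", 2), ("x", 3)],
)

# Roughly what filter(user__isnull=False, modelField=x) asks for.
filter_sql = (
    "SELECT id FROM entry "
    "WHERE user_id IS NOT NULL AND model_field = ? ORDER BY id"
)
# Roughly what filter(modelField=x).exclude(user__isnull=True) asks for.
exclude_sql = (
    "SELECT id FROM entry "
    "WHERE model_field = ? AND NOT (user_id IS NULL) ORDER BY id"
)

filter_ids = [row[0] for row in conn.execute(filter_sql, ("x",))]
exclude_ids = [row[0] for row in conn.execute(exclude_sql, ("x",))]
print(filter_ids, exclude_ids)
```

Both lists contain the same ids, because NOT (user_id IS NULL) and user_id IS NOT NULL are the same condition written two ways.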

Related

Get part of django model depending on a variable

My code
PriceListItem.objects.get(id=tarif_id).price_eur
In my settings.py
CURRENCY='eur'
My Question:
I would like to pick the different info depending on the CURRENCY variable in settings.py
Example:
PriceListItem.objects.get(id=tarif_id).price_+settings.CURRENCY
Is it possible?
Sure. This has nothing to do with Django actually. You can reach the instance's attribute through pure Python:
getattr(PriceListItem.objects.get(id=tarif_id), 'price_'+settings.CURRENCY)
Note it might be a better idea to have a method on the model which accepts the currency as a parameter and returns the correct piece of data (through the line I wrote above, for example).
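A minimal sketch of such a method follows. It uses a plain Python class as a stand-in for the model so that it runs without Django; the method name price_for, the price_usd field and the error handling are assumptions made for illustration, not part of the original code.

```python
class PriceListItem:
    """Plain-Python stand-in for the Django model (hypothetical fields)."""

    def __init__(self, price_eur, price_usd):
        self.price_eur = price_eur
        self.price_usd = price_usd

    def price_for(self, currency):
        # Build the attribute name, e.g. "price_eur", and look it up dynamically.
        field_name = "price_" + currency.lower()
        try:
            return getattr(self, field_name)
        except AttributeError:
            raise ValueError("no price stored for currency %r" % currency)


CURRENCY = "eur"  # stands in for settings.CURRENCY
item = PriceListItem(price_eur=10, price_usd=12)
price = item.price_for(CURRENCY)
print(price)
```

On a real model the same method body would work unchanged, and callers would write something like PriceListItem.objects.get(id=tarif_id).price_for(settings.CURRENCY).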
I think this should work
item = PriceListItem.objects.get(id=tarif_id)
value = getattr(item, 'price_' + settings.CURRENCY)
In case you are only interested in that specific column, you can make the query more efficient with .values_list:
my_price = PriceListItem.objects.values_list(
'price_{}'.format(settings.CURRENCY),
flat=True
).get(id=tarif_id)
This will only fetch that specific column from the database, which can be a bit faster than first fetching the entire row into memory and then discarding the rest.
Here my_price is thus not a PriceListItem object, but the value that is stored for the specific price_cur column.
It will thus result in a query that looks like:
SELECT pricelistitem.price_cur
FROM pricelistitem
WHERE id=tarif_id

Equivalence of Django queryset filtering

I've been using the following two interchangeably in a Django project:
Comments.objects.filter(writer=self.request.user).latest('id')
and
Comments.objects.get(writer=self.request.user)
Are they equivalent in practice? The docs don't seem to explicitly address this.
They are not equivalent at all, but this greatly depends on the particular model. We do not know the particulars of your model Comments, but if we assume that field writer is not unique:
For the first statement:
Comments.objects.filter(writer=self.request.user).latest('id')
Returns, in essence, the object with the largest id among all comments by that writer. If you look at django.db.connections['default'].queries (populated when DEBUG is True), you will see that the resulting query is a SELECT .. ORDER BY .. LIMIT .. statement.
For the second statement:
Comments.objects.get(writer=self.request.user)
Returns the particular record for that writer. If there is more than one, you get a MultipleObjectsReturned exception. If no object is found, you get a DoesNotExist exception. If the field is unique, or there happens to be a single matching object, the resulting query is a plain SELECT .. WHERE statement, which is faster.
Regarding the documentation, the Options.get_latest_by reference has more information about the purpose of the latest function. Think of it more as a convenience provided by Django. It is nonetheless very important to understand how Django evaluates queries and what SQL results, and there are always many ways to achieve the same query, so it is a matter of logic.
I don't know why you would think these are equivalent, or why the docs should address this specifically.
In the case where you only have one matching Comment, yes this will give the same result. But the first version will do it via a more complex query, with an added sort on id.
If you have more than one Comment for that writer - as seems likely - the second version will always give a MultipleObjectsReturned error.
filter gives all rows matching your filter.
latest gives the most recent row (highest id value).
For example:
Comments.objects.filter(writer=self.request.user).latest('id') first selects all Comments written by self.request.user, and then latest picks the newest of them.
get is made to fetch a single row, so:
Comments.objects.get(writer=self.request.user) will give the comment written by self.request.user. There should be only one. If a user can write many comments, then you have to use filter or maybe all. It depends on what you want exactly.
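The difference can be mimicked with plain SQL. The sketch below uses the standard sqlite3 module and a made-up comments table; latest_for and get_for are hypothetical helpers that only imitate the behaviour described above, not Django's implementation. In particular, LookupError stands in for Django's DoesNotExist and MultipleObjectsReturned, and latest_for returns None on an empty result where Django's latest() would raise DoesNotExist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, writer TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO comments (writer, body) VALUES (?, ?)",
    [("alice", "first"), ("bob", "hello"), ("alice", "second")],
)


def latest_for(writer):
    # Analogue of filter(writer=...).latest('id'): sort, then take one row.
    return conn.execute(
        "SELECT id, body FROM comments WHERE writer = ? ORDER BY id DESC LIMIT 1",
        (writer,),
    ).fetchone()


def get_for(writer):
    # Analogue of get(writer=...): exactly one match, otherwise an error.
    rows = conn.execute(
        "SELECT id, body FROM comments WHERE writer = ?", (writer,)
    ).fetchall()
    if not rows:
        raise LookupError("no comment for %r" % writer)
    if len(rows) > 1:
        raise LookupError("%d comments for %r" % (len(rows), writer))
    return rows[0]


print(latest_for("alice"))  # newest of alice's two comments
print(get_for("bob"))       # bob has exactly one comment
```

With a single matching row both helpers return the same thing; with several, only the latest-style query still succeeds.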

Overcoming Exclude List Limit Size

I'm trying to make a query using Django's exclude() and passing a list to it, as in:
(...).exclude(id__in=list(top_vip_deals_filter))
The problem is that, apparently, there is a limit, depending on your database, on the size of the list being passed.
Is this correct?
If so, how can I overcome this?
If not, is there some explanation for the fact that queries silently fail when the list is big?
Thanks
If the top_vip_deals_filter comes from the database, you can set an extra where in the query:
(...).extra(where=['model.id not in (select blah blah)'])
(put your lowercase model name instead of model.)
You can do better if the data model allows you to. If you can do it in SQL, you probably can do it in django.
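As an illustration of the "do it in SQL" idea, the sketch below lets the database do the exclusion with a subquery instead of receiving thousands of ids as parameters. The deal and top_vip_deal tables are invented for the example, and it uses sqlite3 rather than the ORM. In Django, passing a queryset rather than a Python list to id__in generally yields a subquery of a similar shape, but that depends on the backend and is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: all deals, plus the ids that should be excluded.
conn.execute("CREATE TABLE deal (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("CREATE TABLE top_vip_deal (deal_id INTEGER)")
conn.executemany(
    "INSERT INTO deal (id, title) VALUES (?, ?)",
    [(i, "deal %d" % i) for i in range(1, 5001)],
)
conn.executemany(
    "INSERT INTO top_vip_deal (deal_id) VALUES (?)",
    [(i,) for i in range(1, 4998)],
)

# No parameter list is sent at all, so any per-query parameter limit
# of the database does not come into play.
remaining = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM deal "
        "WHERE id NOT IN (SELECT deal_id FROM top_vip_deal) ORDER BY id"
    )
]
print(remaining)
```

If the ids only exist in Python and not in the database, this approach does not apply directly; splitting the list into smaller batches would be the alternative.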

Is there any acceptable way to chop/recombine Django querysets without using the API?

I was forced to use a models.CharField to store some additional flags in one of my models. So I'm abusing each letter of the field as a flag. As an example, 'MM5' would mean "man, married, age above 50" and 'FS2' "female, single, age above 20".
I'm using methods to query/access these flags. Of course I cannot use these methods with the queryset API. I'm using list comprehensions calling the methods to filter an initial queryset and transform them into a plain list, which is good enough for most template feeding:
people = People.objects.filter(name__startswith='J')
people_i_want = [p for p in people if p.myflags_ismale() and p.myflags_isolderthan(30)]
So, is there any ok'ish way to retransform these lists back into a queryset? Or to chop/filter a queryset based on the output of my methods without transforming it to a normal list in the first place?
It would probably be needlessly complicated, as well as bad practice, to try to "re-transform" your list back into a QuerySet. The best thing to do is to use cleverer QuerySet filtering.
You should use the queryset filter by regex syntax to get the functionality you need. Documentation here.
For instance the equivalent to ismale() using regex would be something like...
People.objects.filter(myflags__regex=r'^M') # <-- matches flags such as 'MM5', where the first letter marks a man
# Please note I haven't tested this regex, but the principle is sound
Also, while I'm admittedly not a database guru, I'm fairly certain using this sort of "flags" in a charfield is a rather inefficient way of doing things.
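Flag patterns like these can be sanity-checked with Python's re module before they go into a __regex filter. The sketch below assumes the layout from the question (first character sex, second marital status, third an age-bracket digit). The pattern names and sample values are made up, and the database's regex dialect may differ from Python's for anything more elaborate than this.

```python
import re

# Assumed layout: position 0 = sex (M/F), position 1 = marital status (M/S),
# position 2 = age bracket digit (5 means "above 50").
IS_MALE = re.compile(r"^M")
IS_MALE_OVER_30 = re.compile(r"^M.[3-9]")

flags = ["MM5", "FS2", "MS2", "FM4", "MS3"]
males = [f for f in flags if IS_MALE.search(f)]
males_over_30 = [f for f in flags if IS_MALE_OVER_30.search(f)]
print(males)
print(males_over_30)
```

Anchoring with ^ matters here: an unanchored pattern such as .M. would test the second character (married) rather than the first (man).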
If you must convert a list back into a queryset, then the idiom I use is:
People.objects.filter(pk__in=[x.pk for x in list_of_objects])
Obviously, this hits the database again, but it works if you really need it.

How to limit columns returned by Django query?

That seems simple enough, but all Django queries seem to be 'SELECT *'.
How do I build a query returning only a subset of fields?
In Django 1.1 onwards, you can use defer('col1', 'col2') to exclude columns from the query, or only('col1', 'col2') to only get a specific set of columns. See the documentation.
values does something slightly different: it only gets the columns you specify, but it returns dictionaries rather than model instances.
Append a .values("column1", "column2", ...) to your query
The accepted answer advises defer and only, which the docs discourage in most cases:
only use defer() when you cannot, at queryset load time, determine if you will need the extra fields or not. If you are frequently loading and using a particular subset of your data, the best choice you can make is to normalize your models and put the non-loaded data into a separate model (and database table). If the columns must stay in the one table for some reason, create a model with Meta.managed = False (see the managed attribute documentation) containing just the fields you normally need to load and use that where you might otherwise call defer(). This makes your code more explicit to the reader, is slightly faster and consumes a little less memory in the Python process.