Alternative nullif in Django ORM

I'm using Postgres as the database and Django 1.9.
I have a model with a field 'price', declared with blank=True.
In a ListView I get a queryset. Next, I want to sort it by price, with the price=0 rows at the end.
In SQL I can write it like this:
ORDER BY NULLIF(price, 0) NULLS LAST
How do I write this with the Django ORM? Or with raw SQL?

OK, I found an alternative: write your own NullIf using Django's Func.
from django.db.models import Func
class NullIf(Func):
    template = 'NULLIF(%(expressions)s, 0)'
And use it on the queryset:
queryset.annotate(new_price=NullIf('price')).order_by('new_price')
Edit: Django 2.2 and above have this implemented out of the box. The equivalent code is:
from django.db.models.functions import NullIf
from django.db.models import Value
queryset.annotate(new_price=NullIf('price', Value(0))).order_by('new_price')

You can still ORDER BY the price NULLS LAST if your SELECT returns the price as NULLIF(price, 0). That way you get the ordering you want, and the data is returned in the form you want. In the Django ORM you would select the price with annotate, e.g. TableName.objects.annotate(new_price=NullIf('price', 0)) (the annotation needs a name that differs from the model field), and for the NULLS LAST ordering I'd follow the recommendations here: Django: Adding "NULLS LAST" to query
Otherwise you could also ORDER BY NULLIF(price, 0) DESC, but that reverses the order of the other numeric values, and PostgreSQL puts NULLs first under DESC unless you add NULLS LAST. You can also obviously exclude null prices from the query entirely if you don't require them.
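To see what the NULLIF ordering discussed above does to the rows, here is a minimal sketch using Python's built-in sqlite3 module instead of Postgres or the ORM. The product table and its rows are made up for illustration. Because older SQLite builds do not accept NULLS LAST, the sketch sorts on an IS NULL flag first, which should have the same effect.

```python
import sqlite3

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, price INTEGER)")
conn.executemany(
    "INSERT INTO product (id, price) VALUES (?, ?)",
    [(1, 30), (2, 0), (3, 10), (4, None), (5, 20)],
)

# NULLIF(price, 0) turns 0 into NULL. Sorting on "... IS NULL" first
# pushes those rows to the end (a portable stand-in for NULLS LAST);
# id is only a tie-breaker so the result is deterministic.
rows = conn.execute(
    """
    SELECT id, NULLIF(price, 0) AS new_price
    FROM product
    ORDER BY NULLIF(price, 0) IS NULL, NULLIF(price, 0), id
    """
).fetchall()

for row in rows:
    print(row)
```

Rows with a real price come first in ascending order, and the rows whose price is 0 or NULL end up last.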

Related

Sum aggregation over list items in Django JSONField

I'd like to calculate the sum of all elements in a list inside a JSONField via Django's ORM. The objects basically look like this:
[
{"score": 10},
{"score": 0},
{"score": 40},
...
]
There are several problems that made me use a Raw Query in the end (see SQL query below) but I'd like to know if it is possible with Django's ORM.
SELECT id,
       SUM(elements.score) AS total_score
FROM my_table,
     LATERAL (SELECT
         (jsonb_array_elements(results)->'score')::integer AS score
     ) AS elements
GROUP BY id
ORDER BY total_score DESC
The main problem I faced is that the list in the JSONField needs to be turned into a set via jsonb_array_elements. Afterwards it is impossible to run an aggregate function directly over the results. Postgres complains:
aggregate function calls cannot contain set-returning function calls
Using a LATERAL FROM item, as widely suggested, is not possible with the ORM. Not even Django's .extra() queryset method helps, because it is not possible to specify an additional table that is left unquoted in the final query:
Model.objects.annotate(...).extra(
    tables=["LATERAL (SELECT (jsonb_array_elements(results)->'score')::integer AS score) AS elements"]
)
# ERROR: no relation "LATERAL (SELECT ..."

You can annotate the queryset with the score value from the JSONField, cast it to an integer, keep the distinct values, and sum whatever is left. I think the following query should do the trick:
from django.db.models import IntegerField
from django.db.models import Sum
from django.db.models.fields.json import KeyTextTransform
from django.db.models.functions import Cast
Model.objects.annotate(
    score=Cast(
        KeyTextTransform("score", "JSONField_name"),
        IntegerField(),
    )
).values("score").distinct().aggregate(Sum("score"))["score__sum"]
Note that you will still have to change JSONField_name to match your model.

Annotation with a subquery with multiple result in Django

I use a PostgreSQL database in my project, and I use the example below from the Django documentation.
from django.db.models import OuterRef, Subquery
newest = Comment.objects.filter(post=OuterRef('pk')).order_by('-created_at')
Post.objects.annotate(newest_commenter_email=Subquery(newest.values('email')[:1]))
But instead of the newest commenter's email, I need the last two commenters' emails. I changed [:1] to [:2], but this exception was raised: ProgrammingError: more than one row returned by a subquery used as an expression.

You'll need to aggregate the subquery results in some way, perhaps by using an ARRAY() construct.
You can create a subclass of Subquery to do this:
from django.contrib.postgres.fields import ArrayField
from django.db import models
from django.db.models import Subquery
class Array(Subquery):
    template = 'ARRAY(%(subquery)s)'
    output_field = ArrayField(base_field=models.TextField())
(You can use a more automatic method of getting the output field, but this should work for you for now: see https://schinckel.net/2019/07/30/subquery-and-subclasses/ for more details.)
Then you can use:
posts = Post.objects.annotate(
    newest_commenters=Array(newest.values('email')[:2]),
)
The reason this happens is that a correlated subquery used as an expression in Postgres may only return one row with one column. You can use this mechanism to deal with multiple rows, and perhaps use JSONB construction if you need multiple columns.

Django count group by date from datetime

I'm trying to count the dates on which users registered, based on a DateTime field. In the database this is stored as '2016-10-31 20:49:38', but I'm only interested in the date '2016-10-31'.
The raw SQL query is:
select DATE(registered_at) registered_date, count(registered_at) from User
where course='Course 1' group by registered_date;
It is possible using 'extra', but I've read this is deprecated and should not be done. It works like this though:
queryset = (
    User.objects.all()
    .filter(course='Course 1')
    .extra(select={'registered_date': "DATE(registered_at)"})
    .values('registered_date')
    .annotate(**{'total': Count('registered_at')})
)
Is it possible to do this without using extra?
I read that TruncDate can be used, and I think this is the correct queryset, but it does not work:
queryset = (
    User.objects.all()
    .filter(course='Course 1')
    .annotate(registered_date=TruncDate('registered_at'))
    .values('registered_date')
    .annotate(**{'total': Count('registered_at')})
)
I get <QuerySet [{'total': 508346, 'registered_date': None}]>, so something is going wrong with TruncDate.
If anyone understands this better than me and can point me in the right direction, that would be much appreciated.
Thanks for your help.

I was trying to do something very similar and had the same problems as you. I got it working by adding an order_by clause after applying the TruncDate annotation. So I imagine that this should work for you too:
queryset = (
    User.objects.all()
    .filter(course='Course 1')
    .annotate(registered_date=TruncDate('registered_at'))
    .order_by('registered_date')
    .values('registered_date')
    .annotate(**{'total': Count('registered_at')})
)
Hope this helps!

This is an alternative to TruncDate: use 'registered_at__date' and Django does the truncation for you.
from django.db.models import Count
from django.contrib.auth import get_user_model
metrics = {
    'total': Count('registered_at__date')
}
queryset = (
    get_user_model().objects.all()
    .filter(course='Course 1')
    .values('registered_at__date')
    .annotate(**metrics)
    .order_by('registered_at__date')
)
For PostgreSQL this translates to the DB query:
SELECT
    ("auth_user"."registered_at" AT TIME ZONE 'Asia/Kolkata')::date,
    COUNT("auth_user"."registered_at") AS "total"
FROM
    "auth_user"
GROUP BY
    ("auth_user"."registered_at" AT TIME ZONE 'Asia/Kolkata')::date
ORDER BY
    ("auth_user"."registered_at" AT TIME ZONE 'Asia/Kolkata')::date ASC;
From the above example you can see that the Django ORM reverses the order of the SELECT and GROUP BY arguments. In the Django ORM, .values() roughly controls the GROUP BY clause, while .annotate() controls the SELECT columns and which aggregations are performed. This feels a little odd but is simple once you get the hang of it.
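As a rough illustration of the grouping described above, the sketch below runs the equivalent raw SQL against an in-memory SQLite database through the standard sqlite3 module. The app_user table and its rows are hypothetical, and SQLite's DATE() stands in for the Postgres cast, so treat it as a check of the expected result shape rather than what Django actually emits.

```python
import sqlite3

# Hypothetical table and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE app_user ("
    "id INTEGER PRIMARY KEY, course TEXT, registered_at TEXT)"
)
conn.executemany(
    "INSERT INTO app_user (id, course, registered_at) VALUES (?, ?, ?)",
    [
        (1, "Course 1", "2016-10-31 20:49:38"),
        (2, "Course 1", "2016-10-31 08:00:00"),
        (3, "Course 1", "2016-11-01 09:15:00"),
        (4, "Course 2", "2016-11-01 10:00:00"),
    ],
)

# The GROUP BY expression plays the role of .values(); the COUNT plays
# the role of .annotate(total=Count(...)).
counts = conn.execute(
    """
    SELECT DATE(registered_at) AS registered_date,
           COUNT(registered_at) AS total
    FROM app_user
    WHERE course = 'Course 1'
    GROUP BY DATE(registered_at)
    ORDER BY DATE(registered_at)
    """
).fetchall()

print(counts)
```

Each output row is one calendar date with the number of registrations on that date, which is the shape the queryset should produce once the grouping works.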

"SELECT field as x..." with Django ORM

Is it possible to use the SQL AS keyword with the Django ORM:
SELECT my_field AS something_shiny WHERE my_condition = 1
If it is possible then how?

By now the Django documentation says that one should use extra as a last resort.
So here is a better way to do this query:
from django.db.models import F
Foo.objects.filter(cond=1).annotate(sth_shiny=F('my_field'))

Use extra():
Foo.objects.filter(cond=1).extra(select={'sth_shiny':'my_field'})
Then you can access the sth_shiny attribute on the resulting Foo instances.
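For reference, this is roughly what the aliasing does at the SQL level. It is a small sketch with the standard sqlite3 module and a made-up foo table, not the ORM itself: the column is exposed under the alias, much like the annotated attribute on the model instances.

```python
import sqlite3

# Hypothetical table and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read columns by name
conn.execute("CREATE TABLE foo (my_field TEXT, my_condition INTEGER)")
conn.executemany(
    "INSERT INTO foo (my_field, my_condition) VALUES (?, ?)",
    [("a", 1), ("b", 0), ("c", 1)],
)

rows = conn.execute(
    """
    SELECT my_field AS something_shiny
    FROM foo
    WHERE my_condition = 1
    ORDER BY my_field
    """
).fetchall()

# The value is available under the alias, not the original column name.
shiny_values = [row["something_shiny"] for row in rows]
print(shiny_values)
```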

Django: order by position ignoring NULL

I have a problem with Django queryset ordering.
My model contains a field named position, a PositiveSmallIntegerField which I'd like to use to order query results.
I use order_by('position'), which works great.
Problem: my position field is nullable (null=True, blank=True), because I don't want to specify a position for each of the 50000 instances of my model. When some instances have a NULL position, order_by returns them at the top of the list; I'd like them to be at the end.
In raw SQL, I used to write things like:
IF(position IS NULL or position='', 1, 0)
(see http://www.shawnolson.net/a/730/mysql-sort-order-with-null.html). Is it possible to get the same result using Django, without writing raw SQL?

You can use annotate() from Django aggregation to do the trick:
from django.db.models import Count
items = Item.objects.all().annotate(null_position=Count('position')).order_by('-null_position', 'position')

As of Django 1.8 you can use Coalesce() to replace NULL with a fallback value.
Sample:
import datetime
from django.db.models import Value
from django.db.models.functions import Coalesce
from app import models
# Coalesce works by taking the first non-null value. So we give it
# a date far before any non-null values of last_active. Then it will
# naturally sort behind instances of Box with a non-null last_active value.
the_past = datetime.datetime.now() - datetime.timedelta(days=10*365)
boxes = models.Box.objects.all().annotate(
    new_last_active=Coalesce(
        'last_active', Value(the_past)
    )
).order_by('-new_last_active')
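The same idea can be tried outside Django. Below is a minimal sketch with the standard sqlite3 module, where the box table, its rows and the THE_PAST constant are all assumptions made for the example. COALESCE swaps NULL for a timestamp older than any real value, so those rows fall to the end of a descending sort.

```python
import sqlite3

# Hypothetical table and rows; timestamps are stored as ISO strings,
# which compare correctly as plain text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE box (id INTEGER PRIMARY KEY, last_active TEXT)")
conn.executemany(
    "INSERT INTO box (id, last_active) VALUES (?, ?)",
    [
        (1, "2023-05-01 12:00:00"),
        (2, None),
        (3, "2024-01-15 09:30:00"),
    ],
)

# Assumed fallback: a date earlier than every real last_active value.
THE_PAST = "1970-01-01 00:00:00"

rows = conn.execute(
    "SELECT id FROM box ORDER BY COALESCE(last_active, ?) DESC",
    (THE_PAST,),
).fetchall()
ordered_ids = [row[0] for row in rows]
print(ordered_ids)
```

The trade-off is that the fallback must really lie outside the range of the data, otherwise NULL rows get mixed in with real ones.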

It's a shame there are a lot of questions like this on SO that are not marked as duplicate. See (for example) this answer for the native solution for Django 1.11 and newer. Here is a short excerpt:
Added the nulls_first and nulls_last parameters to Expression.asc() and desc() to control the ordering of null values.
Example usage (from comment to that answer):
from django.db.models import F
MyModel.objects.all().order_by(F('price').desc(nulls_last=True))
Credit goes to the original answer author and commenter.

Using extra(), as Ignacio said, optimizes the final query a lot. In my application I saved more than 500 ms (that's a lot for a query) of database processing time by using extra() instead of annotate().
Here is how it would look in your case:
items = Item.objects.all().extra(
'select': {
'null_position': 'CASE WHEN {tablename}.position IS NULL THEN 0 ELSE 1 END'
}
).order_by('-null_position', 'position')
{tablename} should be something like {Item's app}_item, following Django's default table naming.
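To check what the CASE WHEN flag does to the ordering, here is a small sketch that runs similar SQL directly with the standard sqlite3 module. The item table and its rows are invented for the example, and the id column is added to the ORDER BY only to make the result deterministic.

```python
import sqlite3

# Hypothetical table and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE item (id INTEGER PRIMARY KEY, "position" INTEGER)')
conn.executemany(
    'INSERT INTO item (id, "position") VALUES (?, ?)',
    [(1, 2), (2, None), (3, 1), (4, None)],
)

# null_position is 1 for rows that have a position and 0 for NULLs, so
# sorting it in descending order keeps the NULL rows at the end.
rows = conn.execute(
    """
    SELECT id,
           CASE WHEN "position" IS NULL THEN 0 ELSE 1 END AS null_position
    FROM item
    ORDER BY CASE WHEN "position" IS NULL THEN 0 ELSE 1 END DESC,
             "position",
             id
    """
).fetchall()
item_ids = [row[0] for row in rows]
print(item_ids)
```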

I found that the syntax in Pablo's answer needed to be updated to the following on my 1.7.1 install:
items = Item.objects.all().extra(
    select={
        'null_position': "CASE WHEN {name of Item's table}.position IS NULL THEN 0 ELSE 1 END"
    }
).order_by('-null_position', 'position')

QuerySet.extra() can be used to inject expressions into the query and order by them.
