annotate group by in django - django

I'm trying to perform a query in django that is equivalent to this:
SELECT SUM(quantity * price) from Sales GROUP BY date.
My django query looks like this:
Sales.objects.values('date').annotate(total_sum=Sum('price * quantity'))
The above one throws error:
Cannot resolve keyword 'price * quantity' into field
Then I came up with another query after checking this https://stackoverflow.com/a/18220269/12113049
Sales.objects.values('date').annotate(total_sum=Sum('price', field='price*quantity'))
Unfortunately, this is not helping me much. It gives me SUM(price) GROUP BY date instead of SUM(quantity*price) GROUP BY date.
How do I query this in django?

You should use F expressions to perform arithmetic on fields:
from django.db.models import F, Sum
Sales.objects.values('date').annotate(total_sum=Sum(F('price') * F('quantity')))
Edit: assuming that price is a DecimalField and quantity is an IntegerField (i.e. they are of different types), you need to specify output_field in Sum:
from django.db.models import DecimalField, F, Sum
Sales.objects.values('date').annotate(total_sum=Sum(F('price') * F('quantity'), output_field=DecimalField()))

Related

Sum aggregation over list items in Django JSONField

I'd like to calculate the sum of all elements in a list inside a JSONField via Django's ORM. The objects basically look like this:
[
{"score": 10},
{"score": 0},
{"score": 40},
...
]
There are several problems that made me use a Raw Query in the end (see SQL query below) but I'd like to know if it is possible with Django's ORM.
SELECT id,
SUM(elements.score) AS total_score
FROM my_table,
LATERAL (SELECT
(jsonb_array_elements('results')->'score')::integer AS score
) AS elements
GROUP BY id
ORDER BY total_score DESC
The main problem I faced is that the list in the JSONField needs to be turned into a set via jsonb_array_elements. Afterwards it is impossible to run an aggregate function over the results. Postgres complains:
aggregate function calls cannot contain set-returning function calls
Using a LATERAL FROM -- as widely suggested -- is not possible with the ORM. Not even with Django's .extra() queryset method because it is not possible to specify an additional table that is not quoted in the final query:
Model.objects.annotate(...).extra(
tables="LATERAL (SELECT (jsonb_array_elements('results')->'score')::integer AS score) AS elements"
)
# ERROR: no relation "LATERAL (SELECT ..."
You can annotate the queryset with the score value from the JSONField, Cast it to an integer, retrieve the distinct values, and get the sum of whatever is left. I think the following query should do the trick:
from django.db.models import IntegerField
from django.db.models import Sum
from django.db.models.fields.json import KeyTextTransform
from django.db.models.functions import Cast
Model.objects.annotate(
score=Cast(
KeyTextTransform("score", "JSONField_name"),
IntegerField(),
)
).values("score").distinct().aggregate(Sum("score"))["score__sum"]
Note that you will still have to replace JSONField_name with the name of the JSONField on your model.

How to filter a query on Django ORM

I know this should be pretty basic, but somehow I don't quite get it. I want to get all users whose age is less than 18, so the query should be something like this:
User.objects.filter(age < 18)
What am I doing wrong?
To filter on a less-than condition, use the __lt lookup:
User.objects.filter(age__lt=18)
or if you want to filter on a property that is some expression of fields, you can first annotate:
from django.db.models import F
User.objects.annotate(
age=F('age1') + F('age2')
).filter(age__lt=18)
or if you want to subtract a number:
from django.db.models import F
User.objects.annotate(
ageminus=F('age') - 5
).filter(ageminus__lt=18)
In the first annotate example, the User object is assumed to have no age field, but an age1 and an age2 field. We therefore first introduce an annotation age that is the sum of these two fields.
When you write .filter(age < 18), the Python interpreter looks for a variable named age, and if that exists (which is not necessarily the case), it compares it with 18 and passes the result as a positional parameter to filter(..). So unless you use some proxy objects, as SQLAlchemy does, that will not work.
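The point about eager evaluation can be shown in plain Python. The sketch below uses hypothetical helpers (parse_lookups and Column); they are not how Django or SQLAlchemy are actually implemented, they only illustrate the two styles.

```python
# 1) Python evaluates arguments before the call, so `age` must already exist.
try:
    _ = age < 18  # `age` is not defined anywhere in this script
    outcome = "no error"
except NameError:
    outcome = "NameError"


# 2) Keyword arguments carry the lookup as a plain string instead.
def parse_lookups(**lookups):
    """Split 'field__lookup' keyword names into (field, lookup, value)."""
    parsed = []
    for key, value in lookups.items():
        field, _sep, lookup = key.partition("__")
        parsed.append((field, lookup or "exact", value))
    return parsed


# 3) A proxy object overloads `<` to build a condition instead of a bool.
class Column:
    def __init__(self, name):
        self.name = name

    def __lt__(self, other):
        return (self.name, "lt", other)


parsed = parse_lookups(age__lt=18)
condition = Column("age") < 18
```

Both parsed and condition end up describing "age less than 18" as data that a query builder could translate to SQL, whereas the bare expression fails before any function is called.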

Django ORM query - Sum inside annotate using when condition

I have a table, let's call it DummyTable.
It has the fields price_effective, store_invoice_updated_date, bag_status and gstin_code.
Now I want output that is grouped by month and year (taken from the field store_invoice_updated_date) and by gstin_code.
Along with that group by, I want to do these calculations:
Sum of price_effective as 'forward_price_effective' if the bag_status is other than 'return_accepted' or 'rto_bag_accepted'. I don't know how to do an exclude here, i.e. how to use a filter inside annotate.
Sum of price_effective as 'return_price_effective' if the bag_status is 'return_accepted' or 'rto_bag_accepted'.
A field 'total_price' that subtracts 'return_price_effective' from 'forward_price_effective'.
I have formulated this query, which doesn't work:
from django.db.models.functions import TruncMonth
from django.db.models import Count, Sum, When, Case, IntegerField
DummyTable.objects.annotate(month=TruncMonth('store_invoice_updated_date'), year=TruncYear('store_invoice_updated_date')).annotate(forward_price_effective=Sum(Case(When(bag_status__in=['delivery_done']), then=Sum(forward_price_effective)), output_field=IntegerField()), return_price_effective=Sum(Case(When(bag_status__in=['return_accepted', 'rto_bag_accepted']), then=Sum('return_price_effective')), output_field=IntegerField())).values('month','year','forward_price_effective', 'return_price_effective', 'gstin_code')
I solved it with multiple querysets.
I just couldn't find a way to use 'Case' with 'When' appropriately together with 'filter' and 'exclude'.
basic_query = BagDetails.objects.filter(store_invoice_updated_date__year__in=[2018]).annotate(month=TruncMonth('store_invoice_updated_date'), year=TruncYear('store_invoice_updated_date') ).values('year', 'month', 'gstin_code', 'price_effective', 'company_id', 'bag_status')
forward_bags = basic_query.exclude(bag_status__in=['return_accepted', 'rto_bag_accepted']).annotate(
Sum('price_effective')).values('year', 'month', 'gstin_code', 'price_effective', 'company_id')
return_bags = basic_query.filter(bag_status__in=['return_accepted', 'rto_bag_accepted']).annotate(
Sum('price_effective')).values('month', 'gstin_code', 'price_effective', 'company_id')

Alternative nullif in Django ORM

I use Postgres as the db and Django 1.9.
I have a model with a field 'price', declared with blank=True.
In a ListView, I get a queryset. Next, I want to sort it by price, with price=0 at the end.
In SQL I can write it like this:
ORDER BY NULLIF(price, 0) NULLS LAST
How do I write it with the Django ORM? Or with raw SQL?
OK, I found an alternative: write your own NullIf using Django's Func.
from django.db.models import Func
class NullIf(Func):
    template = 'NULLIF(%(expressions)s, 0)'
And use it on the queryset:
queryset.annotate(new_price=NullIf('price')).order_by('new_price')
Edit: Django 2.2 and above have this implemented out of the box. The equivalent code is:
from django.db.models.functions import NullIf
from django.db.models import Value
queryset.annotate(new_price=NullIf('price', Value(0))).order_by('new_price')
You can still ORDER BY price NULLS LAST if you select the price as SELECT NULLIF(price, 0). That way you get the ordering you want, and the data is returned in the way you want. In the Django ORM you would select the price with annotate, e.g. TableName.objects.annotate(new_price=NullIf('price', 0)) (the annotation needs a name that differs from the model field), and for the NULLS LAST ordering I'd follow the recommendations in: Django: Adding "NULLS LAST" to query
Otherwise you could also ORDER BY NULLIF('price', 0) DESC but that will reorder the other numeric values. You can also obviously exclude null prices from the query entirely if you don't require them.

Django + PostgreSQL group by date on datetimefield

I have a model which has a datetimefield that I'm trying to annotate on grouping by date.
Eg:
order_totals = Transfer.objects.filter(created__range=[datetime.datetime.combine(datetime.date.today(), datetime.time.min) + datetime.timedelta(days=-5), datetime.datetime.combine(datetime.date.today(), datetime.time.max)]).values('created').annotate(Count('id'))
The problem with the above is that it groups by every second/millisecond of the datetime field rather than just the date.
How would I do this?
You should be able to solve this by using QuerySet.extra and add a column to the query
eg.
qs.filter(...).extra(select={'created_date': 'created::date'}).values('created_date')
Starting with Django 1.8, you can also use the new DateTime expression (oddly, it is not documented in the built-in expressions sheet).
import pytz
from django.db.models.expressions import DateTime
qs.annotate(created_date=DateTime('created', 'day', pytz.UTC))
If you want to group by created_date, just chain another aggregating expression :
qs.annotate(created_date=DateTime('created', 'day', pytz.UTC)).values('created_date').annotate(number=Count('id'))
(The seemingly redundant values() call is needed to generate the appropriate GROUP BY. See the aggregation topic in the Django documentation.)