I am using Django's aggregate query expression to total some values. The final value is a division expression that may sometimes feature zero as a denominator. I need a way to escape if this is the case, so that it simply returns 0.
I've tried the following, as I've been using something similar in my annotate expressions:
from django.db.models import Sum, F, FloatField, Case, When

def for_period(self, start_date, end_date):
    return self.model.objects.filter(
        date__range=(start_date, end_date)
    ).aggregate(
        sales=Sum(F("value")),
        purchase_cogs=Sum(F('purchase_cogs')),
        direct_cogs=Sum(F("direct_cogs")),
        profit=Sum(F('profit'))
    ).aggregate(
        margin=Case(
            When(sales=0, then=0),
            default=(Sum(F('profit')) / Sum(F('value')))*100
        )
    )
However, it obviously doesn't work, because as the error says:
'dict' object has no attribute 'aggregate'
What is the proper way to handle this?
This will not work because aggregate returns a dictionary, not a QuerySet (see the docs), so you can't chain two aggregate calls together.
I think using annotate will solve your issue. annotate accepts the same kind of expressions as aggregate, but it computes them for each object and returns a QuerySet with the results saved as attributes, rather than returning a dictionary. As a result, you can chain annotate calls, or even call annotate and then aggregate.
So I believe something like:
return self.model.objects.filter(
    date__range=(start_date, end_date)
).annotate(  # call `annotate`
    sales=Sum(F("value")),
    purchase_cogs=Sum(F('purchase_cogs')),
    direct_cogs=Sum(F("direct_cogs")),
    profit=Sum(F('profit'))
).aggregate(  # then `aggregate`
    margin=Case(
        When(sales=0, then=0),
        default=(Sum(F('profit')) / Sum(F('value')))*100
    )
)
should work.
Hope this helps.
I've made it work (in Django 2.0) with:
from django.db.models import Case, F, FloatField, Sum, When

aggr_results = models.Result.objects.aggregate(
    at_total_units=Sum(F("total_units")),
    ag_pct_units_sold=Case(
        When(at_total_units=0, then=0),
        default=Sum("sold_units") / (1.0 * Sum("total_units")) * 100,
        output_field=FloatField(),
    ),
)
You can't chain together aggregate statements like that. The docs say:
aggregate() is a terminal clause for a QuerySet that, when invoked,
returns a dictionary of name-value pairs.
It returns a Python dict, so you'll need to figure out a way to modify your query to do it all at once. You might be able to replace the first call to aggregate with annotate instead, as it returns a queryset:
Unlike aggregate(), annotate() is not a terminal clause. The output of
the annotate() clause is a QuerySet
As for the possible division by zero, if you do the division in Python you could wrap it in a try/except block, watching for ZeroDivisionError.
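A minimal sketch of guarding the division in Python after a single aggregate() call. The helper name safe_margin and the sample dictionaries are made up for illustration; the keys mirror the sales and profit aliases from the question. Sum returns None when there are no rows to sum, so the sketch treats None like zero instead of relying on an exception.

```python
def safe_margin(totals):
    """Return profit / sales as a percentage, or 0 when sales is zero or missing."""
    sales = totals.get("sales") or 0    # None (no rows) is treated as 0
    profit = totals.get("profit") or 0
    if sales == 0:
        return 0
    return profit / sales * 100


# Made-up stand-ins for the dict returned by
# aggregate(sales=Sum("value"), profit=Sum("profit")).
print(safe_margin({"sales": 200.0, "profit": 50.0}))
print(safe_margin({"sales": 0, "profit": 0}))
print(safe_margin({"sales": None, "profit": None}))
```

Checking the denominator explicitly, rather than catching ZeroDivisionError, also covers the None case, which would raise TypeError instead.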
In use: Django 3.2.10, PostgreSQL 13.4
I have the following queryset, which uses the Count aggregate function:
queryset = Model.objects.all().aggregate(
    trues=Count('id', filter=Q(criteria=True)),
    falses=Count('id', filter=Q(criteria=False)),
)
What I want:
queryset = Model.objects.all().aggregate(
    trues=Count('id', filter=Q(criteria=True)),
    falses=Count('id', filter=Q(criteria=False)),
    total=trues+falses,  # <-- THIS
)
How to do this?
There is little you can do at the query level after aggregation, as aggregate() returns a Python dict object.
I understand that your example is probably not your real situation, since here you can simply do:
Model.objects.aggregate(
    total=(Count('id', filter=Q(criteria=True))
           + Count('id', filter=Q(criteria=False)))
)
What I want to say is that Django provides .values().annotate() to produce a GROUP BY clause, as in SQL.
Taking your example:
queryset = Model.objects.values('criteria').annotate(count=Count('id'))
queryset here is still a QuerySet object, so you can modify it further, like:
queryset = queryset.aggregate(
    total=Sum('count')
)
Hopefully it helps.
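Since aggregate() hands back a plain dict, the derived value can also be added in Python afterwards. A small sketch, assuming the trues and falses keys from the question; the helper name add_total and the sample counts are made up for illustration.

```python
def add_total(counts):
    """Return a copy of an aggregate() result with a derived 'total' key."""
    result = dict(counts)  # copy, so the original dict is left untouched
    result["total"] = result["trues"] + result["falses"]
    return result


# Made-up stand-in for the dict returned by the aggregate() call in the question.
counts = {"trues": 3, "falses": 2}
print(add_total(counts))
```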
It seems you want the total number of objects whose criteria is either True or False, so you can simply do the following:
queryset = Model.objects.filter(
    Q(criteria=True) | Q(criteria=False)
).count()
Or you can use the following (not recommended unless you need the intermediate values):
from django.db.models import Count, F, Q, Sum

query = Model.objects.annotate(
    trues=Count('id', filter=Q(criteria=True)),
    falses=Count('id', filter=Q(criteria=False)),
).annotate(
    trues_false=F('trues') + F('falses')
).aggregate(total=Sum('trues_false'))
I want to do
SELECT [field1], ST_Area(ST_Union(geometry), True) FROM table [group by field1]
In other words, how do I apply a function over an aggregate result? ST_Union is an aggregate. [field1] is just free notation to say I'd like to run the query both with and without this GROUP BY.
Also, ST_Area with 2 arguments seems not to be available in the Django GIS helpers, so it probably has to be written using Func.
I also want to be able to aggregate over everything (with no GROUP BY), but Django seems to add a GROUP BY id if I don't provide any .values() to the queryset.
This seems very confusing. I can't get my head around annotates and aggregates. Thank you!
Apparently I can chain functions around an aggregate as usual, like:
from django.contrib.gis.db.models import Union, GeometryField
from django.contrib.gis.db.models.functions import Transform, Area
qs = qs.annotate(area_total=Area(Transform(Union("geometry"), 98056)))
The issue I was encountering was that I was attempting to use Func() expressions. To pass another function as the first parameter of Func, it must be wrapped with ExpressionWrapper or something similar.
from django.db.models import ExpressionWrapper, FloatField, Func

qs = qs.annotate(
    area_total=Func(
        ExpressionWrapper(Union("geometry"), output_field=GeometryField()),
        True,
        function="ST_Area",
        output_field=FloatField(),
    )
)
I have a filter that should return a queryset with 2 objects that differ in one field. For example:
obj_1 = (name='John', age='23', is_fielder=True)
obj_2 = (name='John', age='23', is_fielder=False)
Both objects are of the same model but have different primary keys. I tried using the filter below:
qs = Model.objects.filter(name='John', age='23').annotate(is_fielder=F('plays__outdoor_game_role')=='Fielder')
This is my first time using annotate, and it gave me the error below:
TypeError: QuerySet.annotate() received non-expression(s): False.
I am new to Django, so what am I doing wrong, and what should be the annotate to get the required objects as shown above?
The solution by @ktowen works well and is quite straightforward.
Here is another solution I am using, hope it is helpful too.
from django.db.models import BooleanField, ExpressionWrapper, Q

queryset = queryset.annotate(
    is_fielder=ExpressionWrapper(
        Q(plays__outdoor_game_role='Fielder'),
        output_field=BooleanField(),
    ),
)
Here are some explanations for those who are not familiar with Django ORM:
annotate creates a new column/field on the fly, in this case is_fielder. This means your model does not need a field named is_fielder, yet after adding the annotation you can read it as obj.is_fielder on each returned object and use it in later filter() or order_by() calls. annotate is extremely useful and flexible, can be combined with almost every other expression, and is a must-know method of the Django ORM.
ExpressionWrapper basically gives you room to wrap a more complicated combination of conditions, in the form ExpressionWrapper(expression, output_field). It is useful when you are combining different types of fields or want to specify an output type that Django cannot infer automatically.
The Q object is a frequently used expression for specifying a condition. I think its most powerful feature is that conditions can be combined:
AND (&): filter(Q(condition1) & Q(condition2))
OR (|): filter(Q(condition1) | Q(condition2))
NOT (~): filter(~Q(condition))
It is possible to mix Q objects with normal keyword conditions, like below:
filter(Q(condition1) | Q(condition2), id__in=[list])
The point is that Q objects must come first, before any keyword arguments, or it will not work.
Case with When(..., then=...) can be explained simply as if con1 ... elif con2 ... elif con3 .... It is quite powerful and, personally, I love using it to build a custom ordering for a queryset.
For example, suppose you need to return a queryset of watch-history items in the order the user watched them. You could fetch them one by one in a for loop to keep the order, but that generates many similar queries. A more elegant way with Case and When would be:
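from django.db.models import Case, When

item_ids = [...]  # primary keys, in the order they were watched
# One When per primary key: the item's position in item_ids becomes its sort key.
ordering = Case(*[When(pk=pk, then=pos) for pos, pk in enumerate(item_ids)])
watch_history = Item.objects.filter(id__in=item_ids).order_by(ordering)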
As you can see, Case and When let you bind each value to a concrete result, which makes them 1) a precise way to express a condition and 2) especially useful when several conditions must be checked in sequence.
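For comparison, here is a different technique from the Case/When ordering above: fetch the rows with one query and restore the order in Python. This is only a sketch; the Item dataclass is a hypothetical stand-in for model instances, and order_by_ids is an invented helper name. The result is a plain list, not a QuerySet, so it cannot be filtered or paginated further in the database.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical stand-in for a model instance with a primary key."""
    id: int
    title: str


def order_by_ids(items, item_ids):
    """Return items sorted to follow the order of item_ids."""
    position = {pk: pos for pos, pk in enumerate(item_ids)}
    return sorted(items, key=lambda item: position[item.id])


item_ids = [3, 1, 2]  # the order in which the user watched the items
fetched = [Item(1, "a"), Item(2, "b"), Item(3, "c")]  # rows in arbitrary database order
ordered = order_by_ids(fetched, item_ids)
print([item.id for item in ordered])
```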
You can use Case/When with annotate
from django.db.models import Case, BooleanField, Value, When
Model.objects.filter(name='John', age='23').annotate(
    is_fielder=Case(
        When(plays__outdoor_game_role='Fielder', then=Value(True)),
        default=Value(False),
        output_field=BooleanField(),
    ),
)
I want to filter on objects that only have related objects with values in a finite set - here's how I tried to write it:
trips = Trip.objects\
    .filter(study=study, field_values__field__name='mode', field_values__int_value__in=modes)\
    .exclude(study=study, field_values__field__name='mode', field_values__int_value__not_in=modes)\
    .all()
I think this would work, except 'not in' is not a valid operator. Unfortunately, 'not modes' here is an infinite set - it could be any int not in modes, so I can't 'exclude in [not modes].'
How can I write this with a Django query?
You can filter this with:
from django.db.models import Count, Q

Trip.objects.filter(
    study=study,
    field_values__field__name='mode'
).annotate(
    total_values=Count('field_values')
).filter(
    total_values=Count('field_values', filter=Q(field_values__int_value__in=modes)),
    total_values__gt=0
)
Here we count the total number of related field_values for the 'mode' field on each trip, and the number of those whose int_value is in the given modes. If both counts are the same, we know that no value exists outside of modes.
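The counting idea can be tried out in plain SQL with the standard-library sqlite3 module. This is only an illustration of the comparison "total rows equals matching rows", not the SQL that Django emits; the field_value table, its columns, and the sample rows are invented for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE field_value (trip_id INTEGER, int_value INTEGER);
    INSERT INTO field_value VALUES
        (1, 10), (1, 20),   -- trip 1: every value is allowed
        (2, 10), (2, 99),   -- trip 2: 99 is not allowed
        (3, 99);            -- trip 3: nothing is allowed
""")

modes = [10, 20]
placeholders = ", ".join("?" for _ in modes)

# Keep a trip only when its total row count equals the number of rows
# whose int_value is in `modes`.
query = f"""
    SELECT trip_id
    FROM field_value
    GROUP BY trip_id
    HAVING COUNT(*) = SUM(CASE WHEN int_value IN ({placeholders}) THEN 1 ELSE 0 END)
    ORDER BY trip_id
"""
matching_trips = [row[0] for row in conn.execute(query, modes)]
print(matching_trips)
```

With these sample rows only trip 1 qualifies, because trips 2 and 3 each have a value outside modes.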
class Zone(Model):
    ...

class Flight(Model):
    zones = ManyToManyField(Zone)

flights = Flight.objects.filter(...)
qs1 = Zone.objects.annotate(
    count=flights.filter(zones__pk=F('pk')).distinct().count(),  # this is not a valid expression
)
Even though the queryset inside the annotation uses F, calling count() on it still throws TypeError: QuerySet.annotate() received non-expression(s): 0., meaning the queryset was executed in place.
The following also doesn't work, but this time it just returns an invalid value (always 1, counting the single Zone object instead of what is inside the filter):
qs1 = Zone.objects.annotate(
    count=Count('pk', filter=flights.filter(zones__pk=F('pk'))),  # with 'flight' instead of first 'pk' it also doesn't work
)
Django evaluates .count() eagerly, so flights.filter(zones__pk=F('pk')).distinct().count() runs immediately and returns an integer. It succeeds because inside that queryset F('pk') refers to the primary key of the Flight, not of the Zone, so it counts the flights that have a zone whose primary key happens to equal the flight's own primary key. To refer to the outer Zone from a subquery you would need OuterRef, together with an .annotate(..) on the subquery.
But you make this too complex. You can simply annotate with:
from django.db.models import Count, Q

Zone.objects.annotate(
    count=Count('flight', distinct=True, filter=Q(flight__…))
)
Here filter=Q(flight__…) stands for the filter you apply to your flights. So if the flights are filtered by a hypothetical active=True, you filter with:
Zone.objects.annotate(
    count=Count('flight', distinct=True, filter=Q(flight__active=True))
)
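To see what a filtered, distinct count per zone amounts to, here is a rough plain-SQL illustration using the standard-library sqlite3 module. It is not the exact SQL Django generates; the table names, the flight_zones join table, and the sample rows are all invented for the sketch, and active follows the hypothetical filter above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE zone (id INTEGER PRIMARY KEY);
    CREATE TABLE flight (id INTEGER PRIMARY KEY, active INTEGER);
    CREATE TABLE flight_zones (flight_id INTEGER, zone_id INTEGER);
    INSERT INTO zone VALUES (1), (2), (3);
    INSERT INTO flight VALUES (1, 1), (2, 0), (3, 1);
    INSERT INTO flight_zones VALUES (1, 1), (2, 1), (3, 1), (2, 2);
""")

# LEFT JOINs keep zones without flights; the CASE yields NULL for inactive
# flights, and COUNT(DISTINCT ...) ignores NULLs.
rows = conn.execute("""
    SELECT z.id,
           COUNT(DISTINCT CASE WHEN f.active = 1 THEN f.id END) AS active_flights
    FROM zone AS z
    LEFT JOIN flight_zones AS fz ON fz.zone_id = z.id
    LEFT JOIN flight AS f ON f.id = fz.flight_id
    GROUP BY z.id
    ORDER BY z.id
""").fetchall()
print(rows)
```

Zone 1 has two active flights, while zones 2 and 3 get a count of 0 rather than being dropped from the result.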