django aggregate for multiple days - django

I have a model with two relevant attributes, date and length (plus others that don't matter here). I need to display a list of the sums of length for each day in a template.
The solution I've used so far is looping day by day and building the list of sums with aggregations like:
for day in month:
    sums.append(MyModel.objects.filter(date=day).aggregate(Sum('length')))
But this seems very inefficient to me because of the number of db lookups. Isn't there a better way to do this? Like caching everything and then filtering it without touching the db?

.values() can be used to group by date, so you will only get unique dates together with the sum of length fields via .annotate():
>>> from django.db.models import Sum
>>> MyModel.objects.values('date').annotate(total_length=Sum('length'))
From docs:
When .values() clause is used to constrain the columns that are returned in the result set, the method for evaluating annotations is slightly different. Instead of returning an annotated result for each result in the original QuerySet, the original results are grouped according to the unique combinations of the fields specified in the .values() clause.
Hope this helps.
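One thing to watch for: a grouped query only returns dates that actually have rows, so a per-day list for a template may still need the gaps filled in. A minimal sketch of that step, assuming the rows look like the dicts that .values('date').annotate(total_length=...) yields; daily_totals is a hypothetical helper and the sample rows are made up:

```python
import calendar
from datetime import date


def daily_totals(rows, year, month):
    """Return one total per calendar day of the month, using 0 for days with no rows."""
    by_day = {row["date"]: row["total_length"] for row in rows}
    days_in_month = calendar.monthrange(year, month)[1]
    return [by_day.get(date(year, month, d), 0) for d in range(1, days_in_month + 1)]


# Sample rows shaped like the output of .values('date').annotate(total_length=Sum('length'))
rows = [
    {"date": date(2024, 2, 1), "total_length": 30},
    {"date": date(2024, 2, 3), "total_length": 45},
]
totals = daily_totals(rows, 2024, 2)
```

This keeps the database work to a single query and does the per-day bookkeeping in memory.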

Related

Django query to fetch top performers for each month

I need to fetch the top performer for each month. The MySQL query below gives me the correct output.
select id,Name,totalPoints, createdDateTime
from userdetail
where app=4 and totalPoints in ( select
max(totalPoints)
FROM userdetail
where app=4
group by month(createdDateTime), year(createdDateTime))
order by totalPoints desc
I am new to Django ORM. I am not able to write an equivalent Django query which does the task. I have been struggling with this logic for 2 days. Any help would be highly appreciated.
While the GROUP BY clause in a subquery is slightly difficult to express with the ORM because aggregate() operations don't emit querysets, a similar effect can be achieved with a Window function:
from django.db.models import Max, Window
from django.db.models.functions import Trunc
UserDetail.objects.filter(total_points__in=UserDetail.objects.annotate(
    max_points=Window(
        expression=Max('total_points'),
        partition_by=[Trunc('created_datetime', 'month')],
    )).values('max_points')
)
In general, this sort of pattern is implemented with Subquery expressions. In this case, I've implicitly used a subquery by passing a queryset to an __in predicate.
The Django documentation's notes on using aggregates within subqueries are also relevant to this sort of query, since you want to use the results of an aggregate in a subquery (which I've avoided by using a window function).
However, I believe your query may not correctly capture what you want to do: as written it could return rows for users who weren't the best in a given month but did have the same score as another user who was the best in any month.
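To make that caveat concrete, here is a small pure-Python sketch with made-up rows (the names, scores, and month strings are all assumptions). It compares matching against the pooled set of monthly maxima, which is what the IN subquery does, with matching each row against the maximum of its own month:

```python
rows = [
    # (name, total_points, month)
    ("ann", 90, "2024-01"),
    ("bob", 70, "2024-01"),
    ("cat", 70, "2024-02"),
    ("dan", 50, "2024-02"),
]

# Highest score seen in each month.
monthly_max = {}
for name, points, month in rows:
    monthly_max[month] = max(monthly_max.get(month, points), points)

# IN-subquery behaviour: any row whose score equals the maximum of *any* month.
pooled = set(monthly_max.values())
loose = [name for name, points, _ in rows if points in pooled]

# Stricter reading: the row must hold the maximum of its *own* month.
strict = [name for name, points, month in rows if points == monthly_max[month]]
```

Here bob shows up in loose because his 70 equals February's best score, even though he was not the best in January; strict leaves him out.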

django annotate queryset with field comparison result

I have a queryset like this:
predicts = Prediction.objects.select_related('match').filter(match_id=pk)
I need to annotate this with a new field, is_correct. I need to compare two string fields and store the result in this new field. The fields that I want to compare are:
predict from Prediction table
result from Match table (that has been joined through select_related)
I need to know what expression to put inside annotate(); below is my current code, which throws a TypeError exception:
predicts = predicts.annotate(is_correct=(F('predict') == F('result')))
All help will be greatly appreciated.
UPDATE:
I found an alternative solution that does the job for me (filtering the Prediction based on Match result using filter and exclude), but I would still like to know how to address this specific case, where the new annotated field is the result of a comparison between two other fields of the queryset. For those who may need it, in Django 2.2 and later the NullIf database function does a comparison between two fields.
You can use the extra function, a hook for injecting specific clauses into the SQL.
First of all, we must know the names of the apps and the models, or the names of the tables in the database.
Let's assume that in your case the two tables are called "app_prediction" and "app_match".
The statement would be as follows:
Prediction.objects.select_related('match').extra(
    select={'is_correct': "app_prediction.predict = app_match.result"}
)
This adds a field called is_correct to each result.
The table and column names in the SQL fragment must match those in the database exactly.
It would be best to see the models.
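To see what that select clause computes, here is a sketch that runs roughly equivalent SQL against an in-memory SQLite database. The table names come from the answer above; the match_id column, the sample rows, and the exact shape of the join are assumptions, not the SQL Django is guaranteed to emit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_match (id INTEGER PRIMARY KEY, result TEXT);
    CREATE TABLE app_prediction (
        id INTEGER PRIMARY KEY,
        predict TEXT,
        match_id INTEGER REFERENCES app_match(id)
    );
    INSERT INTO app_match VALUES (1, '2-1');
    INSERT INTO app_prediction VALUES (1, '2-1', 1), (2, '0-0', 1);
""")

# The comparison is evaluated per row and exposed as an extra column.
rows = conn.execute("""
    SELECT p.id, (p.predict = m.result) AS is_correct
    FROM app_prediction AS p
    JOIN app_match AS m ON m.id = p.match_id
    ORDER BY p.id
""").fetchall()
```

SQLite reports the comparison as 1 or 0; other databases may return a native boolean instead.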

filter statement based on calculations django orm

How can I translate a query like this into the Django ORM?
select id, year, month
where (year*100 + month) between 201703 and 201801
Thanks in advance for any help
You can first create an annotation, and then filter on that:
from django.db.models import F
(Modelname.objects
    .annotate(yearmonth=F('year')*100+F('month'))
    .filter(yearmonth__range=(201703, 201801)))
So here we construct an annotation yearmonth (you can use another name if you like), and make it equal to the year column times 100 plus the month column. Next we can filter on that annotation, and do so by specifying a __range here with two bounds.
Normally this works on any database that supports the operations used here (multiplying a column by a constant, adding two values) as well as the __range filter (in MySQL this is translated into <var> BETWEEN <min> AND <max>). Since we use the Django ORM, if we later switch to another database, the query will be translated into that database's SQL dialect (provided, of course, that this is possible).
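The arithmetic behind the yearmonth annotation can be checked in plain Python. This is only a sketch of the encoding, not ORM code; the yearmonth helper and the sample records are made up for illustration:

```python
def yearmonth(year, month):
    """Encode a (year, month) pair as a single integer, e.g. (2017, 3) -> 201703."""
    return year * 100 + month


records = [(2017, 2), (2017, 3), (2017, 12), (2018, 1), (2018, 2)]

# Both bounds are inclusive, matching BETWEEN and __range.
selected = [(y, m) for y, m in records if 201703 <= yearmonth(y, m) <= 201801]
```

Because month is always below 100, the encoded integers sort in the same order as the (year, month) pairs, which is why a single range comparison is enough.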
How about using something similar to this?
Did you try filter with __range?
Here created_at stands for the date field in your DB:
ModelName.objects.filter(created_at__range=(start_date, end_date))
You can then do the calculation in your view; this is just a workaround.
If you want to run exactly the same query, you can probably use raw():
ModelName.objects.raw("""select id, year, month
where (year*100 + month) between 201703 and 201801""")

Filtering QuerySet by __count of RelatedManager

I've got a QuerySet I'd like to filter by the count of a related_name. Currently I've got something like this:
objResults = myObjects.filter(Q(links_by_source__status=ACCEPTED),Q(links_by_source__count=1))
However, when I run this I get the following error message:
Cannot resolve keyword 'count' into field
I'm guessing that this query is operating individually on each of the links_by_source connections, therefore there is no count function since it's not a QuerySet I'm working with. Is there a way of filtering so that, for each object returned, the number of links_by_source is exactly 1?
You need to use an aggregation function to get the count before you can filter on it.
from django.db.models import Count
(myObjects
    .filter(links_by_source__status=ACCEPTED)
    .annotate(link_count=Count('links_by_source'))
    .filter(link_count=1))
Note: pay attention to the order of annotate and filter here. This query counts only the ACCEPTED links; I'm not sure whether you want that or want to check that the total count of all links is 1.
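The two readings give different answers. A pure-Python sketch with made-up link rows (the source names and statuses are assumptions) shows the difference between "exactly one ACCEPTED link" and "exactly one link in total, which is ACCEPTED":

```python
from collections import Counter

# (source object, link status)
links = [
    ("a", "ACCEPTED"),
    ("a", "REJECTED"),
    ("b", "ACCEPTED"),
    ("c", "ACCEPTED"),
    ("c", "ACCEPTED"),
]

# Filter first, then count: number of ACCEPTED links per source.
accepted_counts = Counter(src for src, status in links if status == "ACCEPTED")
one_accepted = sorted(src for src, n in accepted_counts.items() if n == 1)

# Count everything, then require that the single link is ACCEPTED.
total_counts = Counter(src for src, _ in links)
one_link_total = sorted(
    src for src, n in total_counts.items()
    if n == 1 and accepted_counts[src] == 1
)
```

Source "a" has one accepted link out of two, so it appears only under the first reading.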

Django annotation with nested filter

Is it possible to filter within an annotation?
In my mind something like this (which doesn't actually work)
Student.objects.all().annotate(Count('attendance').filter(type="Excused"))
The resulting table would have every student with the number of excused absences. Looking through the documentation, filters can only be applied before or after the annotation, which would not yield the desired results.
A workaround is this
for student in Student.objects.all():
    student.num_excused_absence = Attendance.objects.filter(student=student, type="Excused").count()
This works but does many queries, in a real application this can get impractically long. I think this type of statement is possible in SQL but would prefer to stay with ORM if possible. I even tried making two separate queries (one for all students, another to get the total) and combined them with |. The combination changed the total :(
Some thoughts after reading answers and comments
I solved the attendance problem using extra sql here.
Timmy's blog post was useful. My answer is based off of it.
hash1baby's answer works but seems equally complex as sql. It also requires executing sql then adding the result in a for loop. This is bad for me because I'm stacking lots of these filtering queries together. My solution builds up a big queryset with lots of filters and extra and executes it all at once.
If performance is no issue, I suggest the for loop workaround. It's by far the easiest to understand.
As of Django 1.8 you can do this directly in the ORM:
from django.db import models
students = Student.objects.all().annotate(num_excused_absences=models.Sum(
    models.Case(
        # count 1 for each excused attendance row, 0 for anything else
        models.When(attendance__type='Excused', then=1),
        default=0,
        output_field=models.IntegerField()
    )))
Answer adapted from another SO question on the same topic
I haven't tested the sample above but did accomplish something similar in my own app.
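The conditional Sum above corresponds to a SUM(CASE WHEN ...) expression in SQL. Here is a sketch of that idea against in-memory SQLite; the table and column names and the sample rows are assumptions, and the SQL is an approximation rather than the exact statement Django generates:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE attendance (
        id INTEGER PRIMARY KEY,
        student_id INTEGER REFERENCES student(id),
        type TEXT
    );
    INSERT INTO student VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO attendance (student_id, type) VALUES
        (1, 'Excused'), (1, 'Excused'), (1, 'Unexcused'), (2, 'Unexcused');
""")

# LEFT JOIN keeps students without any attendance rows; the CASE turns
# every non-excused (or missing) row into 0 before summing.
rows = conn.execute("""
    SELECT s.name,
           SUM(CASE WHEN a.type = 'Excused' THEN 1 ELSE 0 END) AS num_excused
    FROM student AS s
    LEFT JOIN attendance AS a ON a.student_id = s.id
    GROUP BY s.id
    ORDER BY s.id
""").fetchall()
```

Every student appears in the result, including those with no excused absences, which is the property the filter-then-annotate approach loses.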
You are correct - Django does not allow you to filter the related objects being counted without also applying the filter to the primary objects, which excludes any primary objects that have no related objects left after filtering.
But, in a bit of abstraction leakage, you can count groups by using a values query.
So, I collect the absences in a dictionary, and use that in a loop. Something like this:
from itertools import groupby
from django.db.models import Count
# a query for students
students = Student.objects.all()
# a query to count the student attendances, grouped by type;
# ordered by student so that groupby sees each student's rows together
attendance_counts = (Attendance.objects.filter(student__in=students)
    .values_list('student_id', 'type')
    .annotate(abs=Count('pk'))
    .order_by('student_id'))
# regroup that into a dictionary {student -> {type -> count}}
attendance_s_t = {s: {t: c for _, t, c in g}
                  for s, g in groupby(attendance_counts, key=lambda row: row[0])}
# then use them efficiently:
for student in students:
    student.absences = attendance_s_t.get(student.pk, {}).get('Excused', 0)
Maybe this will work for you:
excused = Student.objects.filter(attendance__type='Excused').annotate(abs=Count('attendance'))
You need to filter the Students you're looking for first to just those with excused absences and then annotate the count of them.
Here's a link to the Django Aggregation Docs where it discusses filtering order.