How to exclude items with an identical field when another duplicate has a later date field? - django

So I have a Comments model and by querying
comments = Comments.objects.values('students_id', 'created_at')
I get this output
<QuerySet [
{'students_id': 4, 'created_at': datetime.date(2019, 6, 19)},
{'students_id': 2, 'created_at': datetime.date(2019, 6, 3)},
{'students_id': 1, 'created_at': datetime.date(2019, 6, 24)},
{'students_id': 6, 'created_at': datetime.date(2019, 6, 4)},
{'students_id': 6, 'created_at': datetime.date(2019, 6, 19)},
{'students_id': 5, 'created_at': datetime.date(2019, 6, 5)},
{'students_id': 4, 'created_at': datetime.date(2019, 7, 28)},
{'students_id': 6, 'created_at': datetime.date(2019, 6, 11)}]>
There are three comments by the student with id=6 and two comments by the student with id=4.
What I need is only the latest comment from every student. In this example it would look like this:
<QuerySet [
{'students_id': 2, 'created_at': datetime.date(2019, 6, 3)},
{'students_id': 1, 'created_at': datetime.date(2019, 6, 24)},
{'students_id': 6, 'created_at': datetime.date(2019, 6, 19)},
{'students_id': 5, 'created_at': datetime.date(2019, 6, 5)},
{'students_id': 4, 'created_at': datetime.date(2019, 7, 28)}]>
Thanks in advance for the answer!

You can use annotate with Max to get the desired result:
Comments.objects.values('students_id').annotate(Max('created_at'))
The output will look like this:
<QuerySet [
{'students_id': 2, 'created_at__max': datetime.date(2019, 6, 3)},
{'students_id': 1, 'created_at__max': datetime.date(2019, 6, 24)}, ...]>
Each entry holds a students_id and its latest created_at. To use this, import Max from django.db.models:
from django.db.models import Max
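To make the grouping concrete, here is a plain-Python sketch of what values('students_id').annotate(Max('created_at')) computes, run on the rows from the question. It only mimics the GROUP BY / MAX logic in memory and is not ORM code; the names rows and latest_per_student are made up for the illustration.

```python
from datetime import date

# Rows copied from the question's queryset output.
rows = [
    {"students_id": 4, "created_at": date(2019, 6, 19)},
    {"students_id": 2, "created_at": date(2019, 6, 3)},
    {"students_id": 1, "created_at": date(2019, 6, 24)},
    {"students_id": 6, "created_at": date(2019, 6, 4)},
    {"students_id": 6, "created_at": date(2019, 6, 19)},
    {"students_id": 5, "created_at": date(2019, 6, 5)},
    {"students_id": 4, "created_at": date(2019, 7, 28)},
    {"students_id": 6, "created_at": date(2019, 6, 11)},
]


def latest_per_student(rows):
    """Return {students_id: latest created_at}, i.e. GROUP BY plus MAX."""
    latest = {}
    for row in rows:
        sid, created = row["students_id"], row["created_at"]
        # Keep the greatest date seen so far for this student.
        if sid not in latest or created > latest[sid]:
            latest[sid] = created
    return latest


result = latest_per_student(rows)
for sid, created in sorted(result.items()):
    print(sid, created)
```

Each student appears once, paired with the most recent date, which is the shape asked for in the question.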

Use this code; calling values('students_id') before annotate() is what groups the rows by student:
queryset = Comments.objects.values('students_id').annotate(latest_created_at=Max('created_at'))

In raw SQL it would be ... WHERE NOT EXISTS(SELECT * FROM Comments cc WHERE cc.students_id = c.students_id AND cc.created_at > c.created_at). With the ORM:
from django.db.models import Exists, OuterRef
later_comments = Comments.objects.filter(
    students_id=OuterRef('students_id'),
    created_at__gt=OuterRef('created_at'),
).values('created_at')
latest_comments = Comments.objects.annotate(
    has_later_comments=Exists(later_comments),
).filter(has_later_comments=False)
If created_at is a date column (no time), a plain > is not enough, because more than one comment can be created on the same day and all of them would be kept. In that case the query needs an additional predicate on an extra column that orders the comments (like id): WHERE cc.created_at > c.created_at OR (cc.created_at = c.created_at AND cc.id > c.id)
https://docs.djangoproject.com/en/2.2/ref/models/expressions/#exists-subqueries
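The NOT EXISTS idea, including a tie-break on id for comments created on the same day, can be tried out directly in SQL. The sketch below uses Python's built-in sqlite3 module instead of the ORM; the table layout and the sample rows are assumptions made for the illustration, not the asker's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments ("
    "id INTEGER PRIMARY KEY, students_id INTEGER, created_at TEXT)"
)
# ISO date strings compare correctly as text, so > works on them.
conn.executemany(
    "INSERT INTO comments (id, students_id, created_at) VALUES (?, ?, ?)",
    [
        (1, 4, "2019-06-19"),
        (2, 6, "2019-06-04"),
        (3, 6, "2019-06-19"),
        (4, 4, "2019-07-28"),
        (5, 6, "2019-06-19"),  # same day as id=3: a tie on created_at
    ],
)

# Keep a comment only if no "later" comment by the same student exists.
# "Later" means a greater date, or the same date with a greater id.
LATEST_SQL = """
    SELECT c.id, c.students_id, c.created_at
    FROM comments c
    WHERE NOT EXISTS (
        SELECT 1 FROM comments cc
        WHERE cc.students_id = c.students_id
          AND (cc.created_at > c.created_at
               OR (cc.created_at = c.created_at AND cc.id > c.id))
    )
    ORDER BY c.students_id
"""
latest = conn.execute(LATEST_SQL).fetchall()
print(latest)
```

Without the id tie-break, both comments of student 6 dated 2019-06-19 would survive the filter.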

Related

Get cumsum from aggregated field with Django orm

In my project, I want to get the cumulative sum of an "amount" field which comes from an aggregate. I've read some posts on this but I can't find a way to achieve what I want.
Example model:
class ScheduledOperation(models.Model):
    day = models.DateField()
    amount = models.DecimalField(...)
Example queryset
{'day': datetime.date(2023, 2, 7), 'amount': Decimal('-500.00')} # same day each month
{'day': datetime.date(2023, 2, 7), 'amount': Decimal('1500.00')} # same day each month
{'day': datetime.date(2023, 3, 7), 'amount': Decimal('-500.00')}
{'day': datetime.date(2023, 3, 7), 'amount': Decimal('1500.00')}
{'day': datetime.date(2023, 4, 7), 'amount': Decimal('-500.00')}
{'day': datetime.date(2023, 4, 7), 'amount': Decimal('1500.00')}
{'day': datetime.date(2023, 5, 7), 'amount': Decimal('-500.00')}
{'day': datetime.date(2023, 5, 7), 'amount': Decimal('1500.00')}
{'day': datetime.date(2023, 5, 8), 'amount': Decimal('-4000.00')} # big op here
Where I am so far
ScheduledOperation.objects.order_by('day').values('day').annotate(day_tot=Sum('amount')) gives me the total amount for each day:
{'day': datetime.date(2023, 2, 7), 'day_tot': Decimal('1000')}
{'day': datetime.date(2023, 3, 7), 'day_tot': Decimal('1000')}
{'day': datetime.date(2023, 4, 7), 'day_tot': Decimal('1000')}
{'day': datetime.date(2023, 5, 7), 'day_tot': Decimal('1000')}
{'day': datetime.date(2023, 5, 8), 'day_tot': Decimal('-4000')}
What I want
{'day': datetime.date(2023, 2, 7), 'day_tot': Decimal('1000'), 'cumul_amount':Decimal('1000')}
{'day': datetime.date(2023, 3, 7), 'day_tot': Decimal('1000'), 'cumul_amount':Decimal('2000')}
{'day': datetime.date(2023, 4, 7), 'day_tot': Decimal('1000'), 'cumul_amount':Decimal('3000')}
{'day': datetime.date(2023, 5, 7), 'day_tot': Decimal('1000'), 'cumul_amount':Decimal('4000')}
{'day': datetime.date(2023, 5, 8), 'day_tot': Decimal('-4000'), 'cumul_amount':Decimal('0')}
What I tried
After reading other related posts on this subject, I've tried to use the Window function:
self.coming_scheduled_ops.order_by('day').values('day').annotate(
    day_tot=Sum('amount')
).annotate(
    cumul_amount=Window(
        Sum('amount'), order_by='day'
    )
)
but this does not work:
{'day': datetime.date(2023, 2, 7), 'day_tot': Decimal('1000'), 'cumul_amount': Decimal('1500')}
{'day': datetime.date(2023, 3, 7), 'day_tot': Decimal('1000'), 'cumul_amount': Decimal('3000')}
{'day': datetime.date(2023, 4, 7), 'day_tot': Decimal('1000'), 'cumul_amount': Decimal('4500')}
{'day': datetime.date(2023, 5, 7), 'day_tot': Decimal('1000'), 'cumul_amount': Decimal('6000')}
{'day': datetime.date(2023, 5, 8), 'day_tot': Decimal('-4000'), 'cumul_amount': Decimal('2000')}
I can't use Window(Sum('day_tot')), it throws django.core.exceptions.FieldError: Cannot compute Sum('day_tot'): 'day_tot' is an aggregate
Could someone help me understand the Window function, please?
I'm not sure if that's possible using Django ORM, but you can easily do this using Python.
from django.db.models import Sum

queryset = ScheduledOperation.objects.all()
queryset = (
    queryset.order_by('day')
    .values('day')
    .annotate(day_tot=Sum('amount'))
)
# Accumulate the running total while iterating over the per-day rows.
cumul_amount = 0
for day in queryset:
    cumul_amount += day['day_tot']
    day['cumul_amount'] = cumul_amount
It results in:
{'day': datetime.date(2023, 2, 7), 'day_tot': Decimal('1000'), 'cumul_amount': Decimal('1000')}
{'day': datetime.date(2023, 3, 7), 'day_tot': Decimal('1000'), 'cumul_amount': Decimal('2000')}
{'day': datetime.date(2023, 4, 7), 'day_tot': Decimal('1000'), 'cumul_amount': Decimal('3000')}
{'day': datetime.date(2023, 5, 7), 'day_tot': Decimal('1000'), 'cumul_amount': Decimal('4000')}
{'day': datetime.date(2023, 5, 8), 'day_tot': Decimal('-4000'), 'cumul_amount': Decimal('0')}
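On the Window question itself: at the SQL level, a running total over per-day totals is a window function applied to the aggregate, SUM(SUM(amount)) OVER (ORDER BY day). The sketch below shows that with Python's built-in sqlite3 module rather than the ORM. It assumes an SQLite build with window-function support (3.25 or newer); the table name and the integer amounts are simplifications. Whether a given Django version can express the nested aggregate through Window is not something this thread establishes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scheduled_operation (day TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO scheduled_operation (day, amount) VALUES (?, ?)",
    [
        ("2023-02-07", -500), ("2023-02-07", 1500),
        ("2023-03-07", -500), ("2023-03-07", 1500),
        ("2023-04-07", -500), ("2023-04-07", 1500),
        ("2023-05-07", -500), ("2023-05-07", 1500),
        ("2023-05-08", -4000),
    ],
)

# The inner SUM(amount) is the per-day aggregate (GROUP BY day).
# The outer SUM(...) OVER (ORDER BY day) runs over those grouped rows
# and accumulates them from the first day up to the current one.
CUMULATIVE_SQL = """
    SELECT day,
           SUM(amount) AS day_tot,
           SUM(SUM(amount)) OVER (ORDER BY day) AS cumul_amount
    FROM scheduled_operation
    GROUP BY day
    ORDER BY day
"""
result = conn.execute(CUMULATIVE_SQL).fetchall()
for row in result:
    print(row)
```

The key point is that the window has to sum the per-day totals, not the raw amount column, which is why Window(Sum('amount')) in the attempt above does not give the running total of day_tot.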

django annotate count is giving wrong output

Suppose
class Comment(models.Model):
    ...
    likes = models.ManyToManyField(User,...)
class Post(models.Model):
    ...
    content = models.CharField(...)
    likes = models.ManyToManyField(User,...)
    comment = models.ManyToManyField(Comment,...)
Now if I run
Statement1
Post.objects.annotate(likecount=Count('likes')).values('content','likecount')
Output:
<QuerySet [{'content': 'delta', 'likecount': 3}, {'content': 'gamma', 'likecount': 6}, {'content': 'beta', 'likecount': 7}, {'content': 'alpha', 'likecount': 3}]>
Statement2
Post.objects.annotate(commentlikecount=Count('comment__likes')).values('content','commentlikecount')
Output:
<QuerySet [{'content': 'delta', 'commentlikecount': 6}, {'content': 'gamma', 'commentlikecount': 0}, {'content': 'beta', 'commentlikecount': 3}, {'content': 'alpha', 'commentlikecount': 0}]>
Statement3
Post.objects.annotate(likecount=Count('likes'),commentlikecount=Count('comment__likes')).values('content','likecount','commentlikecount')
Output:
<QuerySet [{'content': 'delta', 'likecount': 18, 'commentlikecount': 18}, {'content': 'gamma', 'likecount': 6, 'commentlikecount': 0}, {'content': 'beta', 'likecount': 21, 'commentlikecount': 21}, {'content': 'alpha', 'likecount': 3, 'commentlikecount': 0}]>
Why is the output of the third statement this instead of
<QuerySet [{'content': 'delta', 'likecount': 3, 'commentlikecount': 6}, {'content': 'gamma', 'likecount': 6, 'commentlikecount': 0}, {'content': 'beta', 'likecount': 7, 'commentlikecount': 3}, {'content': 'alpha', 'likecount': 3, 'commentlikecount': 0}]>
How can I get this as output?
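A likely explanation, offered as a sketch rather than a confirmed diagnosis: combining two Count() annotations across different many-to-many relations makes the database join both relations at once, so each like row is repeated once per comment-like row. For 'delta' that gives 3 x 6 = 18 and for 'beta' 7 x 3 = 21, which matches the output of Statement3. The plain-Python simulation below reproduces the effect with made-up ids. Django's Count accepts distinct=True, which counts distinct related rows instead and is the usual remedy, with the caveat that a user who liked several comments of the same post is then counted only once.

```python
from itertools import product


def joined_rows(like_ids, comment_like_ids):
    """Simulate LEFT JOINing one post to both relations at once.

    An empty relation still yields one row, with None (NULL) on that side.
    """
    left = like_ids or [None]
    right = comment_like_ids or [None]
    return list(product(left, right))


def count(values, distinct=False):
    """COUNT(col) or COUNT(DISTINCT col): NULLs are never counted."""
    non_null = [v for v in values if v is not None]
    return len(set(non_null)) if distinct else len(non_null)


# Hypothetical ids for post 'delta': 3 likes, 6 likes on its comments.
rows = joined_rows([1, 2, 3], [10, 11, 12, 13, 14, 15])
likes = [r[0] for r in rows]
comment_likes = [r[1] for r in rows]

inflated = (count(likes), count(comment_likes))
fixed = (count(likes, distinct=True), count(comment_likes, distinct=True))
print(inflated, fixed)

# Post 'gamma': 6 likes and no comment likes, so nothing is multiplied.
gamma = joined_rows([1, 2, 3, 4, 5, 6], [])
gamma_counts = (count([r[0] for r in gamma]), count([r[1] for r in gamma]))
print(gamma_counts)
```

In ORM terms the corresponding change would be Count('likes', distinct=True) and Count('comment__likes', distinct=True) in Statement3.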

Django ORM queryset equivalent to group by year-month?

I have a Django app and need some data visualization, but I am stuck on the ORM.
I have a model Orders with a field created_at, and I want to present the data as a bar chart (number of orders per year-month) in a dashboard template.
So I need to aggregate/annotate data from my model, but I did not find a complete solution.
I found a partial answer with TruncMonth and read about serializers, but I wonder if there is a simpler solution using the Django ORM...
In Postgresql it would be:
SELECT date_trunc('month',created_at), count(order_id) FROM "Orders" GROUP BY date_trunc('month',created_at) ORDER BY date_trunc('month',created_at);
"2021-01-01 00:00:00+01" "2"
"2021-02-01 00:00:00+01" "3"
"2021-03-01 00:00:00+01" "3"
...
example
1 "2021-01-04 07:42:03+01"
2 "2021-01-24 13:59:44+01"
3 "2021-02-06 03:29:11+01"
4 "2021-02-06 08:21:15+01"
5 "2021-02-13 10:38:36+01"
6 "2021-03-01 12:52:22+01"
7 "2021-03-06 08:04:28+01"
8 "2021-03-11 16:58:56+01"
9 "2022-03-25 21:40:10+01"
10 "2022-04-04 02:12:29+02"
11 "2022-04-13 08:24:23+02"
12 "2022-05-08 06:48:25+02"
13 "2022-05-19 15:40:12+02"
14 "2022-06-01 11:29:36+02"
15 "2022-06-05 02:15:05+02"
16 "2022-06-05 03:08:22+02"
expected result
[
{
"year-month": "2021-01",
"number" : 2
},
{
"year-month": "2021-02",
"number" : 3
},
{
"year-month": "2021-03",
"number" : 3
},
{
"year-month": "2022-03",
"number" : 1
},
{
"year-month": "2022-04",
"number" : 2
},
{
"year-month": "2022-05",
"number" : 2
},
{
"year-month": "2022-06",
"number" : 3
}
]
I have done this but I am not able to order by date:
Orders.objects.annotate(month=TruncMonth('created_at')).values('month').annotate(number=Count('order_id')).values('month', 'number').order_by()
<SafeDeleteQueryset [
{'month': datetime.datetime(2022, 3, 1, 0, 0, tzinfo=<UTC>), 'number': 4},
{'month': datetime.datetime(2022, 6, 1, 0, 0, tzinfo=<UTC>), 'number': 2},
{'month': datetime.datetime(2022, 5, 1, 0, 0, tzinfo=<UTC>), 'number': 1},
{'month': datetime.datetime(2022, 1, 1, 0, 0, tzinfo=<UTC>), 'number': 5},
{'month': datetime.datetime(2021, 12, 1, 0, 0, tzinfo=<UTC>), 'number': 1},
{'month': datetime.datetime(2022, 7, 1, 0, 0, tzinfo=<UTC>), 'number': 1},
{'month': datetime.datetime(2021, 9, 1, 0, 0, tzinfo=<UTC>), 'number': 2},
'...(remaining elements truncated)...'
]>
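Try ordering by the annotated month instead of calling order_by() with no arguments, which clears the ordering. Ordering by the raw created_at field would add it to the GROUP BY and split the monthly groups.
from django.db.models import Count
from django.db.models.functions import TruncMonth
Orders.objects.annotate(month=TruncMonth('created_at')).values('month').\
    annotate(number=Count('order_id')).order_by('month')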
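The queryset yields datetime values for month, while the expected result in the question uses "year-month" strings. A small post-processing step in Python can bridge that gap. This is a sketch on a few sample rows shaped like the output shown in the question (tzinfo omitted for brevity); to_year_month is a made-up helper name.

```python
from datetime import datetime

# A few rows in the shape returned by the values('month', 'number') query.
rows = [
    {"month": datetime(2022, 3, 1), "number": 4},
    {"month": datetime(2022, 6, 1), "number": 2},
    {"month": datetime(2021, 12, 1), "number": 1},
    {"month": datetime(2021, 9, 1), "number": 2},
]


def to_year_month(rows):
    """Sort rows chronologically and format each month as 'YYYY-MM'."""
    ordered = sorted(rows, key=lambda row: row["month"])
    return [
        {"year-month": row["month"].strftime("%Y-%m"), "number": row["number"]}
        for row in ordered
    ]


chart_data = to_year_month(rows)
for entry in chart_data:
    print(entry)
```

The sort is redundant once the query itself is ordered by month, but it keeps the helper safe to use on unordered input.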

Django database query: How to filter objects by date to warn user if >= 8 hours?

I've got a field in one model like
class Project(models.Model):
    date = fields.DateField(auto_now=False)
    user = models.ManyToManyField(User, related_name="projects", blank=True)
    work_times = models.FloatField(default=1, verbose_name="work times(hours)")
Now I would like to sum a user's work_times (the values differ per project) for a given date. The purpose is to warn or remind users when they create a new project, because a user can't have a new project if the work_times of his projects total >= 8 hours on that day. I already have the values below:
[{'work_times': 1.0, 'date': datetime.date(2018, 6, 25)}, {'work_times': 1.0, 'date': datetime.date(2018, 6, 28)}, {'work_times': 1.0, 'date': datetime.date(2018, 6, 28)}, {'work_times': 1.0, 'date': datetime.date(2018, 6, 28)}, {'work_times': 1.0, 'date': datetime.date(2018, 6, 28)}, {'work_times': 1.0, 'date': datetime.date(2018, 6, 28)}, {'work_times': 1.0, 'date': datetime.date(2018, 6, 28)}, {'work_times': 1.0, 'date': datetime.date(2018, 6, 28)}, {'work_times': 1.0, 'date': datetime.date(2018, 6, 28)}, {'work_times': 1.0, 'date': datetime.date(2018, 6, 28)}, {'work_times': 1.0, 'date': datetime.date(2018, 6, 28)}, {'work_times': 1.0, 'date': datetime.date(2018, 6, 28)}, {'work_times': 1.0, 'date': datetime.date(2018, 6, 28)}]
One user has several projects with different work_times on a date. How can I get user A's total work_times on a date across all his projects?
Thanks for your help!
Try an annotation like this, where some_date is the day you want to check:
from django.db.models import Sum, Q
User.objects.annotate(total_times=Sum('projects__work_times', filter=Q(projects__date=some_date))).filter(total_times__gte=8)
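The check itself is just "sum the hours per date and compare with the limit". Here is a plain-Python sketch of that logic on values shaped like the ones in the question; hours_per_day, can_add_project and DAILY_LIMIT_HOURS are hypothetical names, and the 8-hour limit is taken from the question.

```python
from collections import defaultdict
from datetime import date

DAILY_LIMIT_HOURS = 8  # limit stated in the question


def hours_per_day(entries):
    """Sum work_times per date for one user's project entries."""
    totals = defaultdict(float)
    for entry in entries:
        totals[entry["date"]] += entry["work_times"]
    return dict(totals)


def can_add_project(entries, day):
    """True while the user's total for `day` is still below the limit."""
    return hours_per_day(entries).get(day, 0.0) < DAILY_LIMIT_HOURS


# One entry on 2018-06-25 and twelve one-hour entries on 2018-06-28.
entries = [{"work_times": 1.0, "date": date(2018, 6, 25)}] + [
    {"work_times": 1.0, "date": date(2018, 6, 28)} for _ in range(12)
]

totals = hours_per_day(entries)
print(totals)
print(can_add_project(entries, date(2018, 6, 28)))
```

For a single user and date, the same total can also be fetched in one query with aggregate(Sum('work_times')) on a filtered Project queryset, which avoids loading the rows into Python.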

Django count number of records per day

I'm using Django 2.0
I am preparing data to show on a graph in a template. I want to fetch the number of records per day.
This is what I'm doing
qs = self.get_queryset().\
    extra({'date_created': "date(created)"}).\
    values('date_created').\
    annotate(item_count=Count('id'))
but, the output given is
[
{'date_created': datetime.date(2018, 5, 24), 'item_count': 1},
{'date_created': datetime.date(2018, 5, 24), 'item_count': 1},
{'date_created': datetime.date(2018, 5, 24), 'item_count': 1},
{'date_created': datetime.date(2018, 5, 24), 'item_count': 1},
{'date_created': datetime.date(2018, 5, 24), 'item_count': 1},
{'date_created': datetime.date(2018, 5, 24), 'item_count': 1},
{'date_created': datetime.date(2018, 5, 24), 'item_count': 1}
]
Here the data is not grouped: the same date is returned repeatedly with a count of 1.
Try using the TruncDate function.
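The repeated rows with a count of 1 suggest that something besides the date ended up in the GROUP BY. One possible cause, stated here as an assumption because the model is not shown, is a default ordering on the model (Meta.ordering) or an earlier order_by() call, which older Django versions add to the grouping; calling .order_by() with no arguments before values() clears it. The sketch below shows the difference at the SQL level with Python's built-in sqlite3 module; the item table and its rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, created TEXT)")
conn.executemany(
    "INSERT INTO item (created) VALUES (?)",
    [
        ("2018-05-24 08:00:00",),
        ("2018-05-24 09:30:00",),
        ("2018-05-24 17:45:00",),
        ("2018-05-25 10:00:00",),
    ],
)

# Grouping on the day only: one row per day.
per_day = conn.execute(
    "SELECT date(created), COUNT(id) FROM item "
    "GROUP BY date(created) ORDER BY date(created)"
).fetchall()

# Grouping on the day AND the full timestamp, as happens when an extra
# ordering column ends up in GROUP BY: one row per record again.
per_record = conn.execute(
    "SELECT date(created), COUNT(id) FROM item "
    "GROUP BY date(created), created ORDER BY created"
).fetchall()

print(per_day)
print(per_record)
```

The second result has the same shape as the output in the question: the date repeats and every count is 1.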