Django queryset: annotate with calculated value

I am making a very simple notification system for my website, powered by a Django REST Framework API. It's for sending website updates and things to all users, everyone gets the same notifications, and they can then mark it as read / archive it. I have come up with the following model:
class Notification(models.Model):
    title = models.CharField(max_length=255)
    text = models.TextField()
    type = models.CharField(max_length=255, blank=True)
    read_by = models.ManyToManyField(User, blank=True, related_name="read_notifications")
    archived_by = models.ManyToManyField(User, blank=True, related_name="archived_notifications")
    created_at = models.DateTimeField(auto_now_add=True, db_index=True)
    updated_at = models.DateTimeField(auto_now=True)
So there is no receiver field or something like that, as all users get all notifications anyway.
Now I am trying to write the view logic, notably the following 2 things: only fetch non-archived notifications made after the user was created, and add a calculated "is_read" field to it, in a way that doesn't do extra queries for every single notification / user combination.
The query looks like this now:
queryset = (
    Notification.objects
    .order_by("-created_at")
    .filter(created_at__gt=self.request.user.created_at)
    .exclude(archived_by=self.request.user)
)
This does indeed filter out archived notifications as expected, and I think it does so without an extra query for every notification:
SELECT "notifications_notification"."id", "notifications_notification"."title", "notifications_notification"."text", "notifications_notification"."type", "notifications_notification"."created_at", "notifications_notification"."updated_at" FROM "notifications_notification" WHERE ("notifications_notification"."created_at" > 2022-09-26 12:44:04.771961+00:00 AND NOT (EXISTS(SELECT 1 AS "a" FROM "notifications_notification_archived_by" U1 WHERE (U1."user_id" = 1 AND U1."notification_id" = ("notifications_notification"."id")) LIMIT 1))) ORDER BY "notifications_notification"."created_at" DESC
So far so good! But I still need to add an "is_read" value (or "is_unread" if easier) to the query somehow, which I am not able to work out how to do.
How can I finish the query and make it performant as well?

After trial and error I came up with this:
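from django.db.models import Exists, OuterRef
queryset = (
    Notification.objects.order_by("-created_at")
    .filter(created_at__gt=self.request.user.created_at)
    .exclude(archived_by=self.request.user)
    # is_read is True when this user appears in the notification's read_by set
    .annotate(is_read=Exists(Notification.objects.filter(pk=OuterRef("id"), read_by=self.request.user)))
)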
And that works, although it does do 2 subqueries and I wonder if this is going to become a bottleneck later on?
SELECT "notifications_notification"."id", "notifications_notification"."title", "notifications_notification"."text", "notifications_notification"."type", "notifications_notification"."created_at", "notifications_notification"."updated_at", EXISTS(SELECT 1 AS "a" FROM "notifications_notification" U0 INNER JOIN "notifications_notification_read_by" U1 ON (U0."id" = U1."notification_id") WHERE (U0."id" = ("notifications_notification"."id") AND U1."user_id" = 1) LIMIT 1) AS "is_read" FROM "notifications_notification" WHERE ("notifications_notification"."created_at" > 2022-09-26 14:40:29.368043+00:00 AND NOT (EXISTS(SELECT 1 AS "a" FROM "notifications_notification_archived_by" U1 WHERE (U1."user_id" = 1 AND U1."notification_id" = ("notifications_notification"."id")) LIMIT 1))) ORDER BY "notifications_notification"."created_at" DESC
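To get a feel for what the database is asked to do here, below is a small sketch using Python's built-in sqlite3 module instead of Django. The table names, columns and sample rows are made up for illustration and only loosely mirror the generated SQL above. The is_read subquery reads the join table directly, without the extra inner join. In Django that would presumably mean filtering Notification.read_by.through inside Exists, though that is untested here. Either way, the archive filter and the is_read flag are both resolved in a single statement with two correlated EXISTS subqueries, not one query per notification.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE notification (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE notification_read_by (notification_id INTEGER, user_id INTEGER);
    CREATE TABLE notification_archived_by (notification_id INTEGER, user_id INTEGER);
    INSERT INTO notification VALUES (1, 'Welcome'), (2, 'Update'), (3, 'Old news');
    INSERT INTO notification_read_by VALUES (1, 1);      -- user 1 has read #1
    INSERT INTO notification_archived_by VALUES (3, 1);  -- user 1 archived #3
""")

def notifications_for(user_id):
    """Return (id, title, is_read) rows for one user in a single query."""
    sql = """
        SELECT n.id, n.title,
               EXISTS(SELECT 1 FROM notification_read_by r
                      WHERE r.notification_id = n.id AND r.user_id = :uid) AS is_read
        FROM notification n
        WHERE NOT EXISTS(SELECT 1 FROM notification_archived_by a
                         WHERE a.notification_id = n.id AND a.user_id = :uid)
        ORDER BY n.id
    """
    return conn.execute(sql, {"uid": user_id}).fetchall()

rows = notifications_for(1)  # archived #3 is gone, #1 is flagged as read
```

With indexes on the join tables, each EXISTS is a cheap lookup, so two subqueries of this kind are unlikely to be the first bottleneck.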

Related

Determine count of object retrieval per day in django

In a model like the one below
class Watched(Stamping):
    user = models.ForeignKey("User", null=True, blank=True, on_delete=models.CASCADE)
    count = models.PositiveIntegerField(default=0)
    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)
Anytime an object is retrieved, I increment the count attribute.
Now my problem is how to get the number of times an object was retrieved on each day of the week.
For example, WatchedObject1 will have {'Sun': 10, 'Tue': 70, 'Wed': 35}.
This seems like a use case for auditing, and there are plugins for Django that can help you with that. If you don't want to add this dependency, you would have to create another model in which you store the data you need.
class RetrievalOfData(models.Model):
    date_of_retrieval = models.DateTimeField(auto_now_add=True)
    object_retrieved = models.ForeignKey("Watched", on_delete=models.CASCADE)
You could probably also override the manager to create these objects every time the model is queried: https://docs.djangoproject.com/en/3.2/topics/db/managers/
You might find it better to have a separate WatchedModelStats table, and perhaps link it to your model with Django signals. Whenever a countable event takes place, execute something like
try:
    counter = WatchedModelStats.objects.get(name=model_name, date=today)
    counter.count += 1
except WatchedModelStats.DoesNotExist:
    counter = WatchedModelStats(name=model_name, date=today, count=1)
counter.save()
One advantage is extensibility. You could easily implement multiple counts for different event types, if the need later becomes apparent.
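To show how per-date counters turn into the per-weekday mapping the question asks for, here is a minimal plain-Python sketch. The `stats` Counter is a hypothetical in-memory stand-in for a WatchedModelStats table, and the function names are invented for this example.

```python
from collections import Counter
from datetime import date

DAYS = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")

# Hypothetical stand-in for a WatchedModelStats table:
# one counter per (object name, calendar date).
stats = Counter()

def record_retrieval(name, day):
    """Increment the counter for this object on this date."""
    stats[(name, day)] += 1

def counts_by_weekday(name):
    """Sum the per-date counters into a {'Mon': n, ...} mapping."""
    totals = Counter()
    for (obj_name, day), count in stats.items():
        if obj_name == name:
            totals[DAYS[day.weekday()]] += count
    return dict(totals)

record_retrieval("WatchedObject1", date(2022, 9, 25))  # a Sunday
record_retrieval("WatchedObject1", date(2022, 9, 27))  # a Tuesday
record_retrieval("WatchedObject1", date(2022, 10, 4))  # also a Tuesday
weekday_counts = counts_by_weekday("WatchedObject1")
```

Storing one row per date, not one per weekday, keeps the raw data around, so other groupings (per month, per week) stay possible later.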

How could you make this really reaaally complicated raw SQL query with django's ORM?

Good day, everyone. Hope you're doing well. I'm a Django newbie, trying to learn the basics of RESTful development while helping on a small app project. Currently, there's a really difficult query that I must write to create a calculated field that updates my student's status according to the time interval the classes are in. First, let me explain the models:
class StudentReport(models.Model):
    student = models.ForeignKey(Student, on_delete=models.CASCADE,)
    headroom_teacher = models.ForeignKey(Teacher, on_delete=models.CASCADE,)
    upload = models.ForeignKey(Upload, on_delete=models.CASCADE, related_name='reports', blank=True, null=True,)
    exams_date = models.DateTimeField(null=True, blank=True)
    # Other fields that don't matter
class ExamCycle(models.Model):
    student = models.ForeignKey(Student, on_delete=models.CASCADE,)
    headroom_teacher = models.ForeignKey(Teacher, on_delete=models.CASCADE,)
    # Other fields that don't matter
class RecommendedClasses(models.Model):
    report = models.ForeignKey(Report, on_delete=models.CASCADE,)
    range_start = models.DateField(null=True)
    range_end = models.DateField(null=True)
    # Other fields that don't matter
class StudentStatus(models.TextChoices):
    enrolled = 'enrolled'  # started class
    anxious_for_exams = 'anxious_for_exams'
    sticked_with_it = 'sticked_with_it'  # already passed one cycle
So this app will help with the management of a cram school. We first write an initial report on the student and their best/worst subjects in StudentReport. Then a RecommendedClasses object is created that tells the student which classes to enroll in. Finally, we have a cycle of exams (let's say 4 times a year). After the student completes each exam, another report is created, and they can be recommended a new class or moved on to the next level of their previous class.
I'll use the choices in StudentStatus to calculate an annotated field that I will call status on my RecommendedClasses model. I'm having issues with the sticked_with_it status because it's a query that is done after one cycle is completed and two reports have been made (two because this query must be done in StudentStatus, after the 2nd report is created). A 'sticked_with_it' student has a report created after the exams_date of the report for which the RecommendedClasses was created, and that future exams_date falls between 30 days before range_start and 60 days after range_end of the recommendation. (Don't question this, it's just the way the higher-ups want the status.)
I have already come up with two ways to do it, but one is a raw SQL query and the other is way too complicated and slow. Here it is:
SELECT rec.id AS rec_id FROM
school_recommendedclasses rec LEFT JOIN
school_report original_report
ON rec.report_id = original_report.id
AND rec.teacher_id = original_report.teacher_id
JOIN reports_report2 future_report
ON future_report.exams_date > original_report.exams_date
AND future_report.student_id = original_report.student_id
AND future_report.`exams_date` > (rec.`range_start` - INTERVAL 30 DAY)
AND future_report.`exams_date` <
(rec.`range_end` + INTERVAL 60 DAY)
AND original_report.student_id = future_report.student_id
How can I transfer this to a proper DJANGO ORM that is not so painfully unoptimized? I'll show you the other way in the comments.
FWIW, I find this easier to read, but there's very little wrong with your query.
Transforming this to your ORM should be straightforward, and any further optimisations are down to indexes...
SELECT r.id rec_id
FROM reports_recommendation r
JOIN reports_report2 o
ON o.id = r.report_id
AND o.provider_id = r.provider_id
JOIN reports_report2 f
ON f.initial_exam_date > o.initial_exam_date
AND f.patient_id = o.patient_id
AND f.initial_exam_date > r.range_start - INTERVAL 30 DAY
AND f.initial_exam_date < r.range_end + INTERVAL 60 DAY
AND f.provider_id = o.provider_id

django queryset filter foreign key select first record

I have a History model like below
class History(models.Model):
    class Meta:
        app_label = 'subscription'
        ordering = ['-start_datetime']
    subscription = models.ForeignKey(Subscription, related_name='history')
    FREE = 'free'
    Premium = 'premium'
    SUBSCRIPTION_TYPE_CHOICES = ((FREE, 'Free'), (Premium, 'Premium'),)
    name = models.CharField(max_length=32, choices=SUBSCRIPTION_TYPE_CHOICES, default=FREE)
    start_datetime = models.DateTimeField(db_index=True)
    end_datetime = models.DateTimeField(db_index=True, blank=True, null=True)
    cancelled_datetime = models.DateTimeField(blank=True, null=True)
Now I have a queryset filter like the one below
users = get_user_model().objects.all()
queryset = users.exclude(subscription__history__end_datetime__lt=timezone.now())
The issue is that the exclude above checks end_datetime against all the history rows of a subscription. But I only want to compare it with the first history row.
Below is what a particular history object looks like. So I want to write a queryset filter that does the datetime comparison on the first row only.
You could use a Model Manager method for this. The documentation isn't all that descriptive, but you could do something along the lines of:
from django.db import models
from django.utils import timezone
class SubscriptionManager(models.Manager):
    def my_filter(self):
        # You'd want to make this a smaller query most likely
        subscriptions = Subscription.objects.all()
        results = []
        for subscription in subscriptions:
            # first() follows the model's default ordering and may return None
            sub_history = subscription.history_set.first()
            # a missing end_datetime is treated as "not ended yet"
            if sub_history and (sub_history.end_datetime is None or sub_history.end_datetime > timezone.now()):
                results.append(subscription)
        return results
class History(models.Model):
    subscription = models.ForeignKey(Subscription)
    end_datetime = models.DateTimeField(db_index=True, blank=True, null=True)
    objects = SubscriptionManager()
Then: queryset = History.objects.my_filter()
Not a copy-pastable answer, but shows the use of Managers. Given the specificity of what you're looking for, I don't think there's a way to get it just via the plain filter() and exclude().
Without knowing what your end goal here is, it's hard to say whether this is feasible, but have you considered adding a property to the subscription model that indicates whatever you're looking for? For example, if you're trying to get everyone who has a subscription that's ending:
class Subscription(models.Model):
    @property
    def ending(self):
        # timezone.now must be called to get the current datetime
        return self.end_datetime > timezone.now()
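Then in your code, keep in mind that filter() cannot see Python properties, so the check has to run in Python, for example: ending = [s for s in Subscription.objects.all() if s.ending]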
I have tried all kinds of Django expressions (aggregate, query, conditional) but was unable to solve the problem, so I went with RawSQL and it solved the problem.
I used the SQL below to select the first row and then compare the end_datetime:
SELECT (end_datetime > %s OR end_datetime IS NULL) AS result
FROM subscription_history
ORDER BY start_datetime DESC
LIMIT 1;
I will mark my own answer as accepted if no solution using queryset filter chaining turns up in the next 2 days.
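The "look only at the latest history row" idea can be tried outside Django with the built-in sqlite3 module. The sketch below is an illustration under assumptions: the schema and sample rows are invented, and unlike the snippet above the subquery is correlated on subscription_id so that each subscription is judged by its own latest row. In the ORM, a similar shape can probably be built with Subquery and OuterRef.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subscription (id INTEGER PRIMARY KEY);
    CREATE TABLE subscription_history (
        subscription_id INTEGER, start_datetime TEXT, end_datetime TEXT);
    INSERT INTO subscription VALUES (1), (2), (3);
    INSERT INTO subscription_history VALUES
        (1, '2022-01-01', '2022-02-01'),   -- older row, expired
        (1, '2022-06-01', '2099-01-01'),   -- latest row, still active
        (2, '2022-01-01', '2099-01-01'),   -- older row, active
        (2, '2022-06-01', '2022-07-01'),   -- latest row, expired
        (3, '2022-03-01', NULL);           -- latest row, open-ended
""")

def active_subscription_ids(now):
    """Ids of subscriptions whose most recent history row has not ended."""
    sql = """
        SELECT s.id FROM subscription s
        WHERE (SELECT (h.end_datetime > :now) OR (h.end_datetime IS NULL)
               FROM subscription_history h
               WHERE h.subscription_id = s.id
               ORDER BY h.start_datetime DESC
               LIMIT 1)
        ORDER BY s.id
    """
    return [row[0] for row in conn.execute(sql, {"now": now})]

active = active_subscription_ids("2023-01-01")
```

Subscription 2 is left out even though one of its older rows is still active, which is exactly the difference from a plain exclude() across all history rows.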

Creating a query with foreign keys and grouping by some data in Django

I have been thinking about my problem for days and I need a fresh view on it.
I am building a small application for a client to manage his deliveries.
# models.py - Clients app
class ClientPR(models.Model):
    title = models.CharField(max_length=5,
                             choices=TITLE_LIST,
                             default='mr')
    last_name = models.CharField(max_length=65)
    first_name = models.CharField(max_length=65, verbose_name='Prénom')
    frequency = WeekdayField(default=[])  # Return a CommaSeparatedIntegerField from 0 for Monday to 6 for Sunday...
    [...]
# models.py - Delivery app
class Truck(models.Model):
    name = models.CharField(max_length=40, verbose_name='Nom')
    description = models.CharField(max_length=250, blank=True)
    color = models.CharField(max_length=10,
                             choices=COLORS,
                             default='green',
                             unique=True,
                             verbose_name='Couleur Associée')
class Order(models.Model):
    delivery = models.ForeignKey(OrderDelivery, verbose_name='Delivery')
    client = models.ForeignKey(ClientPR)
    order = models.PositiveSmallIntegerField()
class OrderDelivery(models.Model):
    # pass the callable: d.today() would be evaluated only once, at import time
    date = models.DateField(default=d.today)
    truck = models.ForeignKey(Truck, verbose_name='Camion', unique_for_date="date")
So I was trying to build a query and I got this one:
ClientPR.objects.today().filter(order__delivery__date=date.today()).order_by('order__delivery__truck', 'order__order')
But it does not do what I really want.
I want a list of client objects (querysets) grouped by truck and ordered by today's delivery order!
The thing is, I want EVERY client for the day, even those who are not in the delivery list, and with filter that is not possible.
I can make a query with the OrderDelivery model, but I will only get the clients for the delivery, not all of them for the day...
Maybe I will need to do it with a Q object? Or even raw SQL?
Maybe I have built my model relationships the wrong way? Or I need to scale back what I want to do... Well, for now, I need your help to see the problem with new eyes!
Thanks to those who will take some time to help me.
After some tests, I decided to go with 2 queries for one table.
One from the OrderDelivery queryset to get a list of clients grouped by truck, and another one from the ClientPR queryset for all the clients without a delivery set for them.
That way, no problem!
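The two-query approach boils down to combining two result sets in Python. Here is a rough sketch of that combination step, using plain tuples and made-up client names in place of the real querysets; the function name and data shapes are assumptions for illustration only.

```python
from collections import defaultdict

def plan_for_day(all_clients, orders):
    """all_clients: names of every client due today.
    orders: (truck, position, client) tuples for today's deliveries.
    Returns (clients grouped by truck in delivery order,
             clients that have no delivery assigned)."""
    by_truck = defaultdict(list)
    # sorting the tuples orders them by truck first, then by position
    for truck, position, client in sorted(orders):
        by_truck[truck].append(client)
    assigned = {client for _, _, client in orders}
    unassigned = [c for c in all_clients if c not in assigned]
    return dict(by_truck), unassigned

clients = ["Durand", "Martin", "Petit", "Leroy"]
orders = [("green", 2, "Martin"), ("green", 1, "Petit"), ("red", 1, "Durand")]
grouped, leftover = plan_for_day(clients, orders)
```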

Django: Distinct on foreign key relationship

I'm working on a Ticket/Issue-tracker in django where I need to log the status of each ticket. This is a simplification of my models.
class Ticket(models.Model):
    assigned_to = models.ForeignKey(User)
    comment = models.TextField(_('comment'), blank=True)
    created = models.DateTimeField(_("created at"), auto_now_add=True)
class TicketStatus(models.Model):
    STATUS_CHOICES = (
        (10, _('Open'),),
        (20, _('Other'),),
        (30, _('Closed'),),
    )
    ticket = models.ForeignKey(Ticket, verbose_name=_('ticket'))
    user = models.ForeignKey(User, verbose_name=_('user'))
    status = models.IntegerField(_('status'), choices=STATUS_CHOICES)
    date = models.DateTimeField(_("created at"), auto_now_add=True)
Now, getting the status of a ticket is easy: sort by date and retrieve the first row, like this.
ticket = Ticket.objects.get(pk=1)
ticket.ticketstatus_set.order_by('-date')[0].get_status_display()
But then I also want to be able to filter on status in the Admin, and those filters have to get the status through a Ticket queryset, which suddenly makes it more complex. How would I get a queryset with all Tickets with a certain status?
I guess you are trying to avoid a loop (asking each ticket for its status) to filter the queryset manually. As far as I know you cannot avoid that loop. Here are some ideas:
# select_related avoids a lot of database hits when entering the loop
t_status = TicketStatus.objects.select_related('ticket').filter(status=ID_STATUS)
# this is a list with the result
ticket_array = [ts.ticket for ts in t_status]
Or, since you mention you were looking for a QuerySet, this might be what you are looking for
# select_related avoids a lot of database hits when entering the loop
t_status = TicketStatus.objects.select_related('ticket').filter(status=ID_STATUS)
# this is a QuerySet with the result
tickets = Ticket.objects.filter(pk__in=[ts.ticket.pk for ts in t_status])
However, the problem might be in the way you are modeling the data. What you called TicketStatus is more like a TicketStatusLog, because you want to keep track of the user who changed the status and the date of the change.
Therefore, the reasonable approach is to add a field 'current_status' to the Ticket model that is updated each time a new TicketStatus is created. In this way (1) you don't have to order a table each time you ask for a ticket and (2) you would simply do something like Ticket.objects.filter(current_status = ID_STATUS) for what I think you are asking.
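The denormalised current_status idea can be sketched in plain Python, without any Django machinery. Everything here is a simplified stand-in (a dataclass in place of the model, a list in place of the TicketStatus table); in Django the same update would presumably live in TicketStatus.save() or a post_save signal handler.

```python
from dataclasses import dataclass, field
from datetime import datetime

OPEN, OTHER, CLOSED = 10, 20, 30  # same codes as STATUS_CHOICES

@dataclass
class Ticket:
    comment: str
    current_status: int = OPEN
    status_log: list = field(default_factory=list)  # (status, user, timestamp)

    def set_status(self, status, user):
        """Append to the log and keep current_status in sync."""
        self.status_log.append((status, user, datetime.now()))
        self.current_status = status

tickets = [Ticket("printer broken"), Ticket("login fails")]
tickets[0].set_status(OTHER, "alice")
tickets[0].set_status(CLOSED, "bob")

# Filtering no longer needs to sort the log; it reads one field per ticket.
closed = [t.comment for t in tickets if t.current_status == CLOSED]
```

The log still holds the full history, so nothing is lost by caching the latest value on the ticket itself.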