I am struggling with a pretty complex annotation of a QuerySet and I would really appreciate some help.
Here are my models:-
    class Player(models.Model):
        group = models.ForeignKey(Group)

    class Transaction(models.Model):
        created = models.DateTimeField()
        amount = models.DecimalField(decimal_places=2, max_digits=10)
        player = models.ForeignKey(Player)
Given a particular Group, I can get all the transactions for that Group using:-
Transaction.objects.filter(player__group=group)
But what I need is for each of those transactions to be annotated with the overall balance of the group at the time the transaction was created. So for each transaction, I need to sum all the transactions of the group whose created time was earlier than the transaction's created date.
To do this, without ending up with tons of database calls, (I think) requires a complex queryset using things like Subquery and OuterRef but I can't quite figure out the exact logic.
I tried something like this:-
    balance = queryset.annotate(
        balance=Sum(
            Case(
                When(
                    date__lte=F("date"),
                    then=F("amount"),
                ),
                default=0,
                output_field=DecimalField(),
            )
        )
    ).filter(pk=OuterRef("pk"))

    queryset.annotate(
        group_balance=Subquery(balance.values("balance"), output_field=DecimalField())
    )
but I know that's not quite right. I feel like I'm close but it's driving me mad.
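To make the target concrete, here is a minimal sketch of the SQL shape the annotation would need to produce, run with Python's built-in sqlite3 rather than the ORM. The table names (player, txn), the sample rows, and the helper group_balances are hypothetical stand-ins, and the comparison uses <= so each transaction counts towards its own balance (use < for strictly earlier transactions).

```python
import sqlite3

# Hypothetical stand-ins for the Player and Transaction tables; amounts are
# whole numbers only to keep the arithmetic exact in SQLite.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE player (id INTEGER PRIMARY KEY, group_id INTEGER);
CREATE TABLE txn (
    id INTEGER PRIMARY KEY,
    created TEXT,
    amount INTEGER,
    player_id INTEGER REFERENCES player(id)
);
INSERT INTO player VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO txn VALUES
    (1, '2024-01-01', 10, 1),
    (2, '2024-01-02', -4, 2),
    (3, '2024-01-03', 100, 3),
    (4, '2024-01-04', 5, 1);
""")


def group_balances(conn, group_id):
    """Return (id, amount, group_balance) for each transaction of a group."""
    sql = """
        SELECT t.id, t.amount,
               -- correlated subquery: sums the group's transactions up to
               -- and including the outer row's created time
               (SELECT SUM(t2.amount)
                  FROM txn t2
                  JOIN player p2 ON p2.id = t2.player_id
                 WHERE p2.group_id = p.group_id
                   AND t2.created <= t.created) AS group_balance
          FROM txn t
          JOIN player p ON p.id = t.player_id
         WHERE p.group_id = ?
         ORDER BY t.created
    """
    return conn.execute(sql, (group_id,)).fetchall()


for row in group_balances(conn, 1):
    print(row)
```

The inner SELECT is the part that Subquery with OuterRef would have to express: it is correlated on the outer row's group and created time, not on its pk.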
    class Subject(models.Model):
        ...
        students = models.ManyToManyField('Student')
        type = models.CharField(max_length=100)

    class Student(models.Model):
        class = models.IntegerField()
        dropped = models.BooleanField()
        ...
    subjects_with_dropouts = (
        Subject.objects.filter(category=Subject.STEM)
        .prefetch_related(
            Prefetch('students', queryset=Student.objects.filter(class=2020))
        )
        .annotate(dropped_out=Case(
            When(
                students__dropped=True,
                then=True,
            ),
            output_field=BooleanField(),
            default=False,
        ))
        .filter(dropped_out=True)
    )
I am trying to get all Subjects from category STEM, that have dropouts of class 2020, but for some reason I get Subjects that have dropouts from other classes as well.
I know that I can achieve this with

    subjects_with_dropouts = Subject.objects.filter(
        category=Subject.STEM,
        students__dropped=True,
        students__class=2020,
    )

But why doesn't the first approach work? I am using PostgreSQL.
With prefetch_related, the joining is done in Python. A good way to think of this is that your first approach runs two separate queries. The first fetches subjects with at least one student who dropped out, from any class (Case is a conditional expression, not an aggregate, so there is no GROUP BY; referencing students__dropped simply adds a JOIN on the students table, and the 2020 filter never touches this query). The second fetches students in the class of 2020, independently of the JOIN in the first query. The prefetch then connects these two result sets using the through table that ManyToManyField generates automatically to hold the pairs of ids.
A good way to see what is actually happening is to use print(queryset.query), where queryset is any QuerySet instance (for example Subject.objects.all()). Or, if you can install it, Django Debug Toolbar is a fantastic tool that shows the SQL fired by each endpoint and can run EXPLAIN on each query.
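The two-query behaviour can be mimicked in plain Python. This is only a rough model of what happens, not Django's implementation; the sample data, the class_year key (standing in for the class field), and the function name are all made up for illustration.

```python
# Hypothetical rows standing in for the three tables involved.
subjects = [{"id": 1, "name": "Math"}, {"id": 2, "name": "Physics"}]
students = [
    {"id": 10, "class_year": 2020, "dropped": True},
    {"id": 11, "class_year": 2019, "dropped": True},
    {"id": 12, "class_year": 2020, "dropped": False},
]
through = [(1, 10), (2, 11), (2, 12)]  # (subject_id, student_id) pairs


def subjects_with_prefetch():
    by_id = {s["id"]: s for s in students}
    # "Query 1": subjects with at least one dropped student, from ANY class.
    main = [
        subj for subj in subjects
        if any(by_id[st]["dropped"] for sub, st in through if sub == subj["id"])
    ]
    # "Query 2": students of class 2020, filtered on its own.
    class_2020 = {s["id"] for s in students if s["class_year"] == 2020}
    # Join done in Python: attach only the 2020 students to each kept subject.
    return [
        (subj["name"],
         sorted(st for sub, st in through if sub == subj["id"] and st in class_2020))
        for subj in main
    ]


print(subjects_with_prefetch())
```

Physics is still returned even though its only dropout is from 2019: the class filter narrows what gets attached to each subject, not which subjects are selected.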
I'm trying to optimize the queries fired by an API. I have four models, namely User, Content, Rating, and UserRating, with some relations between them. I want the API to return all existing contents along with their rating count and the score a specific user has given to each.
I used to use Content.objects.all() as the queryset, but I realized that with a large amount of data this fires a huge number of queries. So I have tried to reduce the queries using select_related() and prefetch_related(). However, I am still left with an extra search in Python, which I hope to remove with a controlled prefetch_related(): applying a filter to one specific prefetch within a nested prefetch and select.
Here are my models:
    from django.db import models
    from django.conf import settings

    class Content(models.Model):
        title = models.CharField(max_length=50)

    class Rating(models.Model):
        count = models.PositiveBigIntegerField(default=0)
        content = models.OneToOneField(Content, on_delete=models.CASCADE)

    class UserRating(models.Model):
        user = models.ForeignKey(
            settings.AUTH_USER_MODEL, blank=True, null=True, on_delete=models.CASCADE
        )
        score = models.PositiveSmallIntegerField()
        rating = models.ForeignKey(
            Rating, related_name="user_ratings", on_delete=models.CASCADE
        )

        class Meta:
            unique_together = ["user", "rating"]
Here's what I've done so far:
    contents = (
        Content.objects.select_related("rating")
        .prefetch_related("rating__user_ratings")
        .prefetch_related("rating__user_ratings__user")
    )

    for c in contents:  # serializer like
        user_rating = c.rating.user_ratings.all()
        for u in user_rating:  # how to remove this dummy search?
            if u.user_id == 1:
                print(u.score)
Queries:
(1) SELECT "bitpin_content"."id", "bitpin_content"."title", "bitpin_rating"."id", "bitpin_rating"."count", "bitpin_rating"."content_id" FROM "bitpin_content" LEFT OUTER JOIN "bitpin_rating" ON ("bitpin_content"."id" = "bitpin_rating"."content_id"); args=(); alias=default
(2) SELECT "bitpin_userrating"."id", "bitpin_userrating"."user_id", "bitpin_userrating"."score", "bitpin_userrating"."rating_id" FROM "bitpin_userrating" WHERE "bitpin_userrating"."rating_id" IN (1, 2); args=(1, 2); alias=default
(3) SELECT "users_user"."id", "users_user"."password", "users_user"."last_login", "users_user"."is_superuser", "users_user"."first_name", "users_user"."last_name", "users_user"."email", "users_user"."is_staff", "users_user"."is_active", "users_user"."date_joined", "users_user"."user_name" FROM "users_user" WHERE "users_user"."id" IN (1, 4); args=(1, 4); alias=default
As you can see from the queries above, only three queries are fired now, instead of the many that were fired before. However, I guess I can remove the Python search (the second for loop) by filtering the last query, i.e. "users_user"."id" IN (1,) instead. According to this post and my own attempts, I couldn't apply a .filter(rating__user_ratings__user_id=1) to the third query. I also couldn't adapt the Prefetch(..., queryset=...) approach given in this answer to my problem.
I think you are looking for a Prefetch object:
https://docs.djangoproject.com/en/4.0/ref/models/querysets/#prefetch-objects
Try this:
    from django.db.models import Prefetch

    contents = Content.objects.select_related("rating").prefetch_related(
        Prefetch(
            "rating__user_ratings",
            queryset=UserRating.objects.filter(user__id=1),
            to_attr="user_rating_number_1",
        )
    )

    for c in contents:  # serializer like
        print(c.rating.user_rating_number_1[0].score)
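One caveat with the [0] lookup: with to_attr the prefetched rows are stored as a plain list, so for a content the user has not rated the list is empty and [0] raises IndexError. A small guard like the following may help; score_or_none is a hypothetical helper, and SimpleNamespace objects stand in for UserRating instances so the sketch runs without Django.

```python
from types import SimpleNamespace


def score_or_none(user_ratings):
    """Return the score of the first prefetched rating, or None if the list is empty."""
    return user_ratings[0].score if user_ratings else None


# Stand-ins for the list stored under to_attr.
rated = [SimpleNamespace(user_id=1, score=4)]
not_rated = []

print(score_or_none(rated))
print(score_or_none(not_rated))
```

In the loop above this would be called as score_or_none(c.rating.user_rating_number_1). Because of unique_together on user and rating, the list should hold at most one entry per content.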
I am trying to query a list of rooms. A room can have many participants. I want to exclude rooms that contain a participant whom the user has blocked or who has blocked the user. The code below works if participants is a simple ForeignKey(User) field. However, participants is actually a ManyToManyField, declared like this:

    participants = models.ManyToManyField(
        settings.AUTH_USER_MODEL, related_name='participants'
    )

And the code below does not work.
    get_blocked = Exists(
        users_models.Relationship.objects.filter(
            from_user=OuterRef('participants'),
            to_user=info.context.user,
            status='BLOCK',
        )
    )
    blocking = Exists(
        users_models.Relationship.objects.filter(
            from_user=info.context.user,
            to_user=OuterRef('participants'),
            status='BLOCK',
        )
    )
    models.Room.objects.annotate(
        get_blocked=get_blocked, blocking=blocking
    ).filter(
        participants=info.context.user, get_blocked=False, blocking=False
    ).distinct()
Again, the same logic works when the participants are simple ForeignKey(User). I am wondering if there is any way to resolve this issue.
I've read the documentation and looked at other questions posted here, but I can't find or figure out whether this is possible in Django.
I have a model relating actors and movies:
    class Role(models.Model):
        title_id = models.CharField('Title ID', max_length=20, db_index=True)
        name_id = models.CharField('Name ID', max_length=20, db_index=True)
        role = models.CharField('Role', max_length=300, default='?')
This is a single table that has pairs of actors and movies, so given a movie (title_id), there's a row for each actor in that movie. Similarly, given an actor (name_id), there's a row for every movie that actor was in.
I need to execute a query to return the list of all title_id's that are related to a given title_id by a common actor. The SQL for this query looks like this:
SELECT DISTINCT r2.title_id
FROM role as r1, role as r2
WHERE r1.name_id = r2.name_id
AND r1.title_id != r2.title_id
AND r1.title_id = <given title_id>
Can something like this be expressed in a single Django ORM query, or am I forced to use two queries with some intervening code? (Or raw SQL?)
Normally I would break this into Actor and Movie tables to make it easier to query, but given your requirement I will give it a go:

    def get_related_titles(title_id):
        # name_id of every actor in the given title; stays a lazy subquery
        actors = Role.objects.filter(title_id=title_id).values_list('name_id', flat=True)
        # other titles sharing one of those actors (DISTINCT, as in your SQL)
        return Role.objects.filter(name_id__in=actors).exclude(title_id=title_id).values_list('title_id', flat=True).distinct()

This should give you one query. You can verify it this way:

    print(get_related_titles(some_title_id).query)
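As a sanity check outside Django, the self-join from the question can be compared with an IN-subquery on name_id, which is roughly the shape one would expect a single ORM query to take. This sketch uses sqlite3 with made-up rows; the table name role and both function names are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE role (title_id TEXT, name_id TEXT, role TEXT);
INSERT INTO role VALUES
    ('t1', 'a', '?'), ('t1', 'b', '?'),
    ('t2', 'a', '?'),
    ('t3', 'b', '?'), ('t3', 'c', '?'),
    ('t4', 'c', '?');
""")


def related_by_self_join(conn, title_id):
    """The query from the question, with ORDER BY added for stable output."""
    sql = """
        SELECT DISTINCT r2.title_id
          FROM role AS r1, role AS r2
         WHERE r1.name_id = r2.name_id
           AND r1.title_id != r2.title_id
           AND r1.title_id = ?
         ORDER BY r2.title_id
    """
    return [row[0] for row in conn.execute(sql, (title_id,))]


def related_by_subquery(conn, title_id):
    """Same result expressed as a single statement with an IN subquery."""
    sql = """
        SELECT DISTINCT title_id
          FROM role
         WHERE name_id IN (SELECT name_id FROM role WHERE title_id = ?)
           AND title_id != ?
         ORDER BY title_id
    """
    return [row[0] for row in conn.execute(sql, (title_id, title_id))]


print(related_by_self_join(conn, "t1"))
print(related_by_subquery(conn, "t1"))
```

Title t4 is not returned for t1: it shares an actor with t3, but none with t1 itself.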
I'm working on a Ticket/Issue-tracker in django where I need to log the status of each ticket. This is a simplification of my models.
    class Ticket(models.Model):
        assigned_to = models.ForeignKey(User)
        comment = models.TextField(_('comment'), blank=True)
        created = models.DateTimeField(_("created at"), auto_now_add=True)

    class TicketStatus(models.Model):
        STATUS_CHOICES = (
            (10, _('Open'),),
            (20, _('Other'),),
            (30, _('Closed'),),
        )
        ticket = models.ForeignKey(Ticket, verbose_name=_('ticket'))
        user = models.ForeignKey(User, verbose_name=_('user'))
        status = models.IntegerField(_('status'), choices=STATUS_CHOICES)
        date = models.DateTimeField(_("created at"), auto_now_add=True)
Now, getting the status of a ticket is easy: sort by date and retrieve the first row, like this.
ticket = Ticket.objects.get(pk=1)
ticket.ticketstatus_set.order_by('-date')[0].get_status_display()
But I also want to be able to filter on status in the Admin, and there the status has to be reached through a Ticket queryset, which suddenly makes it more complex. How would I get a queryset with all Tickets that have a certain status?
I guess you are trying to avoid a loop (asking each ticket for its status) to filter the queryset manually. As far as I know you cannot avoid that loop. Here are some ideas:

    # select_related avoids a lot of database hits inside the loop
    t_status = TicketStatus.objects.select_related('ticket').filter(status=ID_STATUS)
    # this is a list with the result
    ticket_array = [ts.ticket for ts in t_status]
Or, since you mention you are looking for a QuerySet, this might be what you want:

    # select_related avoids a lot of database hits inside the loop
    t_status = TicketStatus.objects.select_related('ticket').filter(status=ID_STATUS)
    # this is a QuerySet with the result
    tickets = Ticket.objects.filter(pk__in=[ts.ticket.pk for ts in t_status])
However, the problem might be in the way you are modeling the data. What you call TicketStatus is really more of a TicketStatusLog, because you want to keep track of who changed the status and when.
Therefore, the reasonable approach is to add a 'current_status' field to the Ticket model that is updated each time a new TicketStatus is created. This way (1) you don't have to sort a table each time you ask for a ticket's status and (2) you can simply do something like Ticket.objects.filter(current_status=ID_STATUS) for what I think you are asking.
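The trade-off can be sketched with sqlite3, outside Django. Everything here is illustrative: the schema is a cut-down stand-in for the two models, and add_status, tickets_by_log and tickets_by_column are hypothetical helpers. The sketch assumes status entries are always added in chronological order, so the last write is the current status.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ticket (id INTEGER PRIMARY KEY, current_status INTEGER);
CREATE TABLE ticketstatus (
    id INTEGER PRIMARY KEY,
    ticket_id INTEGER REFERENCES ticket(id),
    status INTEGER,
    date TEXT
);
INSERT INTO ticket (id) VALUES (1), (2);
""")


def add_status(conn, ticket_id, status, date):
    """Append a log row and refresh the denormalized current_status column."""
    conn.execute(
        "INSERT INTO ticketstatus (ticket_id, status, date) VALUES (?, ?, ?)",
        (ticket_id, status, date),
    )
    conn.execute(
        "UPDATE ticket SET current_status = ? WHERE id = ?", (status, ticket_id)
    )


def tickets_by_log(conn, status):
    """Tickets whose latest log entry has the given status (sorts per ticket)."""
    sql = """
        SELECT t.id FROM ticket t
         WHERE (SELECT s.status FROM ticketstatus s
                 WHERE s.ticket_id = t.id
                 ORDER BY s.date DESC LIMIT 1) = ?
         ORDER BY t.id
    """
    return [row[0] for row in conn.execute(sql, (status,))]


def tickets_by_column(conn, status):
    """Same answer read straight from the denormalized column."""
    sql = "SELECT id FROM ticket WHERE current_status = ? ORDER BY id"
    return [row[0] for row in conn.execute(sql, (status,))]


add_status(conn, 1, 10, "2024-01-01")  # ticket 1 opened
add_status(conn, 2, 10, "2024-01-02")  # ticket 2 opened
add_status(conn, 1, 30, "2024-01-03")  # ticket 1 closed

print(tickets_by_log(conn, 10), tickets_by_column(conn, 10))
```

Both lookups agree, but only the column version avoids the per-ticket ordering. In Django the update step would live wherever TicketStatus rows are created, for example in an overridden save().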