Django Queryset Prefetch Optimization for iterating over nested results - django

I'm looking for a way to optimize how I process a queryset result in Django by reducing database access, given that I need to fetch a nested relation.
Taking these models as an example:
class Movie(models.Model):
    name = models.CharField(max_length=50)
class Ticket(models.Model):
    code = models.CharField(max_length=255, blank=True, unique=True)
    movie = models.ForeignKey(Movie, related_name='tickets')
class Buyer(models.Model):
    name = models.CharField(max_length=50)
class Purchase(models.Model):
    tickets = models.ManyToManyField(Ticket, related_name='purchases')
    buyer = models.ForeignKey(Buyer, related_name='purchases')
and this Movie QuerySet:
movies = Movie.objects.all().prefetch_related('tickets__purchases__buyer')
In case I need to retrieve all buyers from each Movie in the above QuerySet, this is one approach:
for movie in movies:
    buyers = Buyer.objects.filter(purchases__tickets__in=movie.tickets.all()).distinct()
But that will hit the database once for each Movie iterated. So, to avoid the per-movie queries, I'm doing something like:
def get_movie_buyers(movie):
    buyers = set()
    for ticket in movie.tickets.all():
        for purchase in ticket.purchases.all():
            if purchase.buyer:
                buyers.add(purchase.buyer)
    return buyers
for movie in movies:
    buyers = get_movie_buyers(movie)
This approach avoids the per-movie queries thanks to the prefetch_related in the QuerySet (Django runs one extra query per prefetched relation, regardless of the number of movies), but it doesn't look optimal: I'm iterating over several nested loops, and all the prefetched objects are held in application memory.
There might be a better approach that I couldn't figure out yet, looking for some guidance.
UPDATE
Alasdair suggested using a Prefetch object, so I tried that:
movies = Movie.objects.prefetch_related(
    Prefetch(lookup='tickets__purchases__buyer',
             to_attr='buyers')
).all()
for movie in movies:
    print(movie.buyers)
But this gives me the following error:
'Movie' object has no attribute 'buyers'

The reason this seems so difficult is the ManyToMany relation between Purchase and Ticket.
This relation allows the same ticket to be present in multiple purchases. But that will not happen in the actual data, as one ticket can be purchased only once.
The query can be simplified if you remove this ManyToMany field and add a ForeignKey field on Ticket pointing to Purchase:
class Ticket(models.Model):
    code = models.CharField(max_length=255, blank=True, unique=True)
    movie = models.ForeignKey(Movie, related_name='tickets')
    # String reference, because Purchase is defined after Ticket in the models above
    purchase = models.ForeignKey('Purchase', null=True, blank=True)
Then the query can be simplified as below.
movies = Movie.objects.all().prefetch_related('tickets__purchase__buyer')
for movie in movies:
    # .all() is required on the related manager; it reuses the prefetched tickets
    print(set(ticket.purchase.buyer for ticket in movie.tickets.all() if ticket.purchase))
Admittedly, this adds complexity when creating Purchases, since you need to update the Ticket objects with the purchase_id.
You need to decide where to keep the complexity based on how frequently each action occurs.
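To see why the ForeignKey layout makes the lookup simpler, here is a minimal sketch at the SQL level. It uses Python's built-in sqlite3 instead of Django so it runs standalone; the table names, column names, and sample rows are assumptions that only mirror the models above, not the schema Django would actually generate.

```python
import sqlite3

# Hypothetical tables mirroring the models above; Django's real table
# names (for example app_ticket) would differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE buyer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (
        id INTEGER PRIMARY KEY,
        buyer_id INTEGER REFERENCES buyer(id)
    );
    CREATE TABLE ticket (
        id INTEGER PRIMARY KEY,
        code TEXT UNIQUE,
        movie_id INTEGER REFERENCES movie(id),
        purchase_id INTEGER REFERENCES purchase(id)  -- NULL until sold
    );
    INSERT INTO movie VALUES (1, 'Alien'), (2, 'Brazil');
    INSERT INTO buyer VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO purchase VALUES (1, 1), (2, 2), (3, 1);
    INSERT INTO ticket VALUES
        (1, 'A1', 1, 1), (2, 'A2', 1, 1), (3, 'A3', 1, 2),
        (4, 'B1', 2, 3), (5, 'B2', 2, NULL);
""")


def buyers_by_movie(conn):
    """Return {movie name: set of buyer names} using a single JOIN query."""
    rows = conn.execute("""
        SELECT DISTINCT movie.name, buyer.name
        FROM ticket
        JOIN movie ON movie.id = ticket.movie_id
        JOIN purchase ON purchase.id = ticket.purchase_id
        JOIN buyer ON buyer.id = purchase.buyer_id
    """)
    result = {}
    for movie_name, buyer_name in rows:
        result.setdefault(movie_name, set()).add(buyer_name)
    return result


buyers = buyers_by_movie(conn)
```

With a ForeignKey from ticket to purchase there is no junction table to pass through, and the inner join drops unsold tickets (NULL purchase_id) on its own, which is what the `if ticket.purchase` check does in the loop above.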

Related

Django Query ManyToMany with Custom Through Table Field Data

I've been trying to figure this one out for a while now but am confused. Every ManyToMany relationship always goes through a third table which isn't that difficult to understand. But in the event that the third table is a custom through table with additional fields how do you grab the custom field for each row?
Here's a sample table I made. How can I get all the movies a User has watched along with the additional watched field and finished field? This example assumes the user is only allowed to see the movie once whether they finish it or not so there will only be 1 record for each movie they saw.
class Movie(models.Model):
    title = models.CharField(max_length=191)
class User(models.Model):
    username = models.CharField(max_length=191)
    watched = models.ManyToManyField(Movie, through='Watch')
class Watch(models.Model):
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    movie = models.ForeignKey(Movie, on_delete=models.CASCADE)
    watched = models.DateTimeField()
    finished = models.BooleanField()
Penny for your thoughts my friends.
You can use:
from django.db.models import F
my_user.watched.annotate(
    watched=F('watch__watched'),
    finished=F('watch__finished')
)
This will return a QuerySet of Movies that contain as extra attributes .watched and .finished.
That being said, it might be cleaner to just access the watch_set, and thus iterate over the Watch objects and access the .movie object for details about the movie. You can use .select_related(..) [Django-doc] to fetch the information about the Movies in the same database query:
for watch in my_user.watch_set.select_related('movie'):
    print(f'{watch.movie.title}: {watch.watched}, {watch.finished}')
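Both approaches above come down to a join between the movie table and the through table. The following is a rough sketch of that join using Python's built-in sqlite3 rather than Django, so that it runs on its own; the table names (app_movie, app_user, app_watch) and the sample rows are assumptions for illustration, not what Django necessarily generates.

```python
import sqlite3

# Hypothetical tables mirroring Movie, User and the Watch through model.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_movie (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE app_user (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE app_watch (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES app_user(id),
        movie_id INTEGER REFERENCES app_movie(id),
        watched TEXT,
        finished INTEGER
    );
    INSERT INTO app_movie VALUES (1, 'Alien'), (2, 'Brazil'), (3, 'Casablanca');
    INSERT INTO app_user VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO app_watch VALUES
        (1, 1, 1, '2020-01-01 20:00', 1),
        (2, 1, 2, '2020-01-02 21:00', 0),
        (3, 2, 3, '2020-01-03 19:00', 1);
""")


def watched_movies(conn, user_id):
    """Return (title, watched, finished) for every movie the user has seen."""
    rows = conn.execute(
        """
        SELECT app_movie.title, app_watch.watched, app_watch.finished
        FROM app_movie
        JOIN app_watch ON app_watch.movie_id = app_movie.id
        WHERE app_watch.user_id = ?
        ORDER BY app_watch.watched
        """,
        (user_id,),
    )
    # SQLite stores booleans as integers, so convert them back.
    return [(title, watched, bool(finished)) for title, watched, finished in rows]


ann_movies = watched_movies(conn, 1)
```

The extra through-table columns come back in the same row as the movie, which is why either annotating the Movie queryset or iterating over the Watch objects with select_related needs only one query.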

django query with prefetch_related

I am struggling to get at the data I need from a prefetch-related query.
I have a table of events (the calendar table), a table of members and an attendee table which links the two.
My models look like:
class Member(models.Model):
    firstname = models.CharField(max_length=40)
    lastname = models.CharField(max_length=50)
    email = models.EmailField(blank=True, verbose_name='e-mail')
    phone = models.CharField(max_length=40)
    membershipnum = models.CharField(max_length=40)
class Attendee(models.Model):
    memberid = models.ForeignKey(Member, on_delete=models.SET(0), related_name="attendingmembers")
    calendarid = models.ForeignKey(Calendar, on_delete=models.SET(0))
    attended = models.BooleanField(default=0)
    paid = models.BooleanField(default=0)
    class Meta:
        db_table = 'attendee'
For a particular event I want a list of attending members with the attended and paid fields from the attendee table.
In my view I have
attendees = Member.objects.filter(attendingmembers__calendarid_id=id).prefetch_related('attendingmembers')
I am getting the right members, but I don't know if this is the best way to do it? And I can't figure out how to get at the attendee fields.
If I do
for thisone in attendees:
    print(thisone)
    print(thisone.attendingmembers)
I get the expected return from the first print, but the second just gives me
myapp.Attendee.None
Any advice much appreciated.
You still need .all() to get the list of items in your relation:
for thisone in attendees:
    print(thisone.attendingmembers.all())
By the way, do you want all of each member's attendingmembers, or only the ones for the right calendarid?
from django.db.models import Prefetch
# Returns all Members having at least one attendingmember with calendarid=id, and prefetches all of their attendingmembers
attendees = Member.objects.filter(attendingmembers__calendarid=id).prefetch_related('attendingmembers')
# Returns the same Members, but prefetches only their attendingmembers matching the filter
attendees = Member.objects.filter(attendingmembers__calendarid=id).prefetch_related(Prefetch('attendingmembers', queryset=Attendee.objects.filter(calendarid=id)))
The documentation shows you how to use the to_attr argument in the Prefetch objects ;)
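The difference between the two prefetches is easier to see once you remember that prefetch_related runs a separate query for the related rows and joins them to the parents in Python. Below is a minimal sketch of that two-step process, written with Python's built-in sqlite3 instead of Django so it runs standalone. The table layout, the sample rows, and the helper name members_with_attendance are assumptions made for illustration.

```python
import sqlite3
from collections import defaultdict

# Hypothetical tables mirroring Member and Attendee.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE member (id INTEGER PRIMARY KEY, firstname TEXT);
    CREATE TABLE attendee (
        id INTEGER PRIMARY KEY,
        memberid_id INTEGER REFERENCES member(id),
        calendarid_id INTEGER,
        attended INTEGER,
        paid INTEGER
    );
    INSERT INTO member VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO attendee VALUES
        (1, 1, 10, 1, 1),   -- Ann, event 10
        (2, 1, 11, 0, 0),   -- Ann, event 11
        (3, 2, 11, 1, 0);   -- Bob, event 11
""")


def members_with_attendance(conn, event_id, only_this_event):
    """Mimic filter(...).prefetch_related(...) with two explicit queries."""
    # Query 1: members with at least one attendee row for the event.
    # DISTINCT is a simplification to keep one row per member.
    members = conn.execute(
        """
        SELECT DISTINCT member.id, member.firstname
        FROM member
        JOIN attendee ON attendee.memberid_id = member.id
        WHERE attendee.calendarid_id = ?
        ORDER BY member.id
        """,
        (event_id,),
    ).fetchall()
    if not members:
        return {}

    # Query 2: the "prefetch", optionally restricted to the same event.
    ids = [member_id for member_id, _ in members]
    placeholders = ", ".join("?" for _ in ids)
    sql = (
        "SELECT memberid_id, calendarid_id, attended, paid "
        f"FROM attendee WHERE memberid_id IN ({placeholders})"
    )
    params = list(ids)
    if only_this_event:
        sql += " AND calendarid_id = ?"
        params.append(event_id)
    sql += " ORDER BY id"

    # Join the related rows to their parents in Python.
    related = defaultdict(list)
    for member_id, calendar_id, attended, paid in conn.execute(sql, params):
        related[member_id].append((calendar_id, bool(attended), bool(paid)))
    return {name: related[member_id] for member_id, name in members}


all_rows = members_with_attendance(conn, 11, only_this_event=False)
event_rows = members_with_attendance(conn, 11, only_this_event=True)
```

Without the extra filter, Ann's row for event 10 shows up alongside her row for event 11; with it, each member carries only the attendee row for the requested event, which is usually what a per-event attendance list needs.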

How to retrieve a set of objects, filtered and ordered by fields of other objects, for which the desired object is a foreign key?

To rephrase the title to the context of my problem: How to retrieve a set of foods, filtered and ordered by fields of other objects, for which the food object is a foreign key?
I have the following models:
class Food(models.Model):
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    name = models.CharField(max_length=200)
    description = models.CharField(max_length=200, blank=True)
class DayOfFood(models.Model):
    user = models.ForeignKey(User)
    date = models.DateField()
    class Meta:
        # unique_together only takes effect inside the model's Meta class
        unique_together = ("user", "date")
class FoodEaten(models.Model):
    day = models.ForeignKey(DayOfFood, on_delete=models.CASCADE)
    food = models.ForeignKey(Food, on_delete=models.CASCADE)
    servings = models.FloatField(default=1)
I want to be able to retrieve the foods that a given user ate most recently. This collection of foods will be passed to a template so it must be a QuerySet to allow the template to loop over the food objects.
This is how far I got
days = DayOfFood.objects.filter(user=request.user)
foodeatens = FoodEaten.objects.filter(day__in=days)
foodeatens = foodeatens.order_by('day__date')
Now it feels like I am almost there, all the foods I want are contained in the FoodEaten objects in the resulting QuerySet. I do not know how to use "for ... in:" to get the food objects and still have them stored in a QuerySet. Is there a way to perform foreach or map to a QuerySet?
I do not want to rewrite the template to accept FoodEaten objects instead because the template is used by other views which do simply pass food objects to the template.
The solution
The answer from Shang Wang helped me write code that solves my problem:
days = DayOfFood.objects.filter(user=request.user)
foods = Food.objects.filter(
    foodeaten__day__in=days,
    foodeaten__day__user=request.user) \
    .order_by('-foodeaten__day__date')
That can be done by chaining relations in the lookup:
Food.objects.filter(foodeaten__day__in=days,
                    foodeaten__day__user=request.user) \
    .order_by('foodeaten__day__date')
By the way, I'm not sure why you have user on both the Food and DayOfFood models. If you really need user relations on both of them, consider making the field names more explicit about the user's role in each model; otherwise you will get confused very quickly.
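One thing to watch for with this kind of query: filtering and ordering across a multi-valued relation joins one row per FoodEaten, so a food eaten on several days can appear more than once. The sketch below shows the effect with Python's built-in sqlite3 (not Django), next to a grouped variant that keeps each food once, ordered by its latest date. Table names, sample data, and both helper names are assumptions for illustration. In the ORM, one way to express the grouped variant would be to annotate each Food with Max('foodeaten__day__date') and order by that annotation.

```python
import sqlite3

# Hypothetical tables mirroring Food, DayOfFood and FoodEaten.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE food (id INTEGER PRIMARY KEY, user_id INTEGER, name TEXT);
    CREATE TABLE dayoffood (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        date TEXT,
        UNIQUE (user_id, date)
    );
    CREATE TABLE foodeaten (
        id INTEGER PRIMARY KEY,
        day_id INTEGER REFERENCES dayoffood(id),
        food_id INTEGER REFERENCES food(id),
        servings REAL DEFAULT 1
    );
    INSERT INTO food VALUES (1, 1, 'apple'), (2, 1, 'bread');
    INSERT INTO dayoffood VALUES
        (1, 1, '2020-01-01'), (2, 1, '2020-01-02'), (3, 1, '2020-01-03');
    INSERT INTO foodeaten (day_id, food_id) VALUES (1, 1), (2, 2), (3, 1);
""")


def foods_join_order(conn, user_id):
    """Plain join ordered by date: one row per FoodEaten, so foods can repeat."""
    rows = conn.execute(
        """
        SELECT food.name
        FROM food
        JOIN foodeaten ON foodeaten.food_id = food.id
        JOIN dayoffood ON dayoffood.id = foodeaten.day_id
        WHERE dayoffood.user_id = ?
        ORDER BY dayoffood.date DESC, foodeaten.id DESC
        """,
        (user_id,),
    )
    return [name for (name,) in rows]


def foods_most_recent_first(conn, user_id):
    """Group by food and order by the latest date it was eaten."""
    rows = conn.execute(
        """
        SELECT food.name, MAX(dayoffood.date) AS last_eaten
        FROM food
        JOIN foodeaten ON foodeaten.food_id = food.id
        JOIN dayoffood ON dayoffood.id = foodeaten.day_id
        WHERE dayoffood.user_id = ?
        GROUP BY food.id, food.name
        ORDER BY last_eaten DESC
        """,
        (user_id,),
    )
    return [name for name, _ in rows]


with_repeats = foods_join_order(conn, 1)
unique_recent = foods_most_recent_first(conn, 1)
```

Whether repeats matter depends on the template: if it should list each food once, the grouped form is the safer starting point.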

django model design - ManyToMany or ForeignKey

I can't decide whether to use ForeignKey or ManyToManyField.
Suppose I am building an application for an event that requires tickets for access, and delegates may get coupons based on the category of the ticket they hold. I might have the following classes:
class Coupon(models.Model):
    name = models.CharField()
    event = models.ForeignKey(Event)
    created_by = models.ForeignKey(User)
    expired_time = models.DateTimeField()
    description = models.TextField()
    created_at = models.DateTimeField()
class CouponTicketMap(models.Model):
    coupon = models.ForeignKey(Coupon)
    tickets = models.ManyToManyField(Ticket)
class CouponUserMap(models.Model):
    coupon = models.ForeignKey(Coupon)
    users = models.ManyToManyField(User)
The organizer can map coupons to one or more tickets,
and/or to some selected or random users.
(I do not need an extra field in the intermediate table that is why I did not use through here.)
I can redesign the 2nd and 3rd model as
class CouponTicketMap(models.Model):
    coupon = models.ForeignKey(Coupon)
    tickets = models.ForeignKey(Ticket)
class CouponUserMap(models.Model):
    coupon = models.ForeignKey(Coupon)
    users = models.ForeignKey(User)
I think I can achieve what I need with either design, but I want to understand the consequences of each. Which design is preferable when considering aspects such as performance, storage, and conventional style? Can anybody shed some light on how to decide?
Thanks
I'd suggest this model, based on what you describe:
class CouponTicketMap(models.Model):
    coupon = models.ForeignKey(Coupon)
    tickets = models.ForeignKey(Ticket)
class CouponUserMap(models.Model):
    coupon = models.ForeignKey(Coupon)
    users = models.ManyToManyField(User)
That's because one coupon can have many tickets, and many users can share the same coupon. I don't see the need to stick to just one field type when you can use both, depending on what the design needs. Hope my opinion helps.

Manytomany django using existing key

I hope this is not a duplicate question. I am trying to set up models in Django.
In model 1 I have one kind of item (parts); together these can form item type 2 (car).
I get the prices for all of these from an outside interface into a Prices model.
How can I set up the relationship between price -> part and price -> car?
When I receive the prices, I do not know whether the ident belongs to a car or a part.
class parts(models.Model):
    ident = models.CharField("IDENT", max_length=12, unique=True, primary_key=True)
    name = models.CharField(max_length=30)
class car(models.Model):
    ident = models.CharField("IDENT", max_length=12, unique=True)
    start_date = models.DateField()
    end_date = models.DateField()
    parts = models.ManyToManyField(parts)
class Prices(models.Model):
    ident = models.CharField(max_length=12)
    price = models.DecimalField(max_digits=10, decimal_places=4)
    date = models.DateField()
    def __unicode__(self):
        return self.ident
    class Meta:
        unique_together = (("ident", "date"),)
I would imagine you would not store the price in your model, since you need it to be 100% real time. So you have:
car models.py
from parts.models import parts
class car(models.Model):
    name = models.CharField(max_length=100)
    parts = models.ManyToManyField(parts)
Hopefully you're not trying to build a full-scale AutoZone-type system, but if it's simply a car model object that is composed of many parts, then this is the basic setup you would want. The many-to-many relationship to parts allows one car to have many parts, and parts can belong to many cars. You don't have to specify a ManyToMany relationship in the parts model, as the reverse direction is already handled by the field on your car model.
As far as price is concerned, you could have a price database field in your parts model, but once again, if this needs to be real time, you probably want to request the price via an API and display it directly in your webpage.