Django Query ManyToMany with Custom Through Table Field Data - django

I've been trying to figure this one out for a while now but am confused. Every ManyToMany relationship always goes through a third table which isn't that difficult to understand. But in the event that the third table is a custom through table with additional fields how do you grab the custom field for each row?
Here's a sample table I made. How can I get all the movies a User has watched along with the additional watched field and finished field? This example assumes the user is only allowed to see the movie once whether they finish it or not so there will only be 1 record for each movie they saw.
from django.db import models

class Movie(models.Model):
    title = models.CharField(max_length=191)

class User(models.Model):
    username = models.CharField(max_length=191)
    watched = models.ManyToManyField(Movie, through='Watch')

class Watch(models.Model):
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    movie = models.ForeignKey(Movie, on_delete=models.CASCADE)
    watched = models.DateTimeField()
    finished = models.BooleanField()
Penny for your thoughts my friends.

You can use:
from django.db.models import F
my_user.watched.annotate(
    watched=F('watch__watched'),
    finished=F('watch__finished'),
)
This returns a QuerySet of Movies, each carrying .watched and .finished as extra attributes.
That being said, it might be cleaner to just access the watch_set, and thus iterate over the Watch objects and access the .movie object for details about the movie. You can use .select_related(..) [Django-doc] to fetch the information about the Movies in the same database query:
for watch in my_user.watch_set.select_related('movie'):
    print(f'{watch.movie.title}: {watch.watched}, {watch.finished}')
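To see what either approach boils down to, here is a minimal sketch of the underlying join, written against an in-memory SQLite database rather than Django. The table and column names (movie, watch, user_id and so on) and the sample rows are assumptions for illustration; the tables Django actually creates are prefixed with the app label.

```python
import sqlite3

# Hypothetical schema mirroring Movie and the Watch through table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE watch (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        movie_id INTEGER REFERENCES movie(id),
        watched TEXT,
        finished INTEGER
    );
    INSERT INTO movie VALUES (1, 'Alien'), (2, 'Brazil');
    INSERT INTO watch VALUES
        (1, 10, 1, '2020-01-01 20:00', 1),
        (2, 10, 2, '2020-01-02 21:00', 0),
        (3, 11, 1, '2020-01-03 19:00', 1);
""")

def watched_movies(user_id):
    """Return (title, watched, finished) for every movie the user has seen."""
    return conn.execute(
        """
        SELECT movie.title, watch.watched, watch.finished
        FROM movie
        JOIN watch ON watch.movie_id = movie.id
        WHERE watch.user_id = ?
        ORDER BY movie.id
        """,
        (user_id,),
    ).fetchall()

rows = watched_movies(10)
for title, watched, finished in rows:
    print(f"{title}: {watched}, finished={bool(finished)}")
```

The extra fields live on the through-table row, so any query that joins movie to watch can read them alongside the movie columns.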

Related

How to get the first record of a 1-N relationship from the main table with Django ORM?

I have a Users table which is FK to a table called Post. How can I get only the last Post that the user registered? The intention is to return a list of users with the last registered post, but when obtaining the users, if the user has 3 posts, the user is repeated 3 times. I'm interested in only having the user once. Is there an alternative that is not unique?
from django.db import models
from django.utils import timezone

class User(models.Model):
    name = models.CharField(max_length=50)

class Post(models.Model):
    title = models.CharField(max_length=50)
    user = models.ForeignKey(User, on_delete=models.CASCADE, related_name='posts', related_query_name='posts')
    created = models.DateTimeField(default=timezone.now)

    class Meta:
        get_latest_by = 'created'
        ordering = ['-created']
I already tried select_related and prefetch_related, but I keep getting the same user multiple times when they have multiple Posts.
user = User.objects.select_related('posts').all().values_list('id', 'name', 'posts__title', 'posts__created')
This does give me the answer I want, but when I change the created field to sort by date, I don't get the newest record, I always get the oldest.
user = User.objects.select_related('posts').all().values_list('id', 'name', 'posts__title', 'posts__created').distinct('id')
I'm trying to do it without resorting to a record-by-record for loop that fetches the most recent Post for each user. I know that this is an alternative, but I'm trying to find a way to do it directly with the Django ORM, since there are thousands of records and a for loop is less than optimal.
In that case your Django ORM query would first filter posts by user then order by created in descending order and get the first element of the queryset.
last_user_post = Post.objects.filter(user__id=1).order_by('-created').first()
Alternatively, you can use a user instance:
user = User.objects.get(id=1)
last_user_post = Post.objects.filter(user=user).order_by('-created').first()
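The queries above return the latest post for a single user. For the list-of-users case in the question, one common shape is a correlated subquery that picks the newest post per user; in the ORM this is usually written with Subquery and OuterRef from django.db.models. The sketch below shows the idea in plain SQL on an in-memory SQLite database. The table names (app_user, app_post) and the sample rows are assumptions, not what Django would generate for the models above.

```python
import sqlite3

# Hypothetical tables standing in for the User and Post models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_user (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE app_post (
        id INTEGER PRIMARY KEY,
        title TEXT,
        user_id INTEGER REFERENCES app_user(id),
        created TEXT
    );
    INSERT INTO app_user VALUES (1, 'ana'), (2, 'bob');
    INSERT INTO app_post VALUES
        (1, 'first', 1, '2021-01-01'),
        (2, 'second', 1, '2021-02-01'),
        (3, 'third', 1, '2021-03-01'),
        (4, 'only', 2, '2021-01-15');
""")

# One row per user: join only the post whose id is the newest for that user.
latest = conn.execute("""
    SELECT u.id, u.name, p.title, p.created
    FROM app_user AS u
    JOIN app_post AS p ON p.id = (
        SELECT p2.id
        FROM app_post AS p2
        WHERE p2.user_id = u.id
        ORDER BY p2.created DESC, p2.id DESC
        LIMIT 1
    )
    ORDER BY u.id
""").fetchall()

for row in latest:
    print(row)
```

Because this is an inner join, users without any posts are left out; a LEFT JOIN would keep them with empty post columns.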

Django project architecture advice

I have a Django project with a Post model which looks like this:
class BasicPost(models.Model):
    author = models.ForeignKey('auth.User', on_delete=models.CASCADE)
    published = models.BooleanField(default=False)
    created_date = models.DateTimeField(auto_now_add=True)
    title = models.CharField(max_length=100, blank=False)
    body = models.TextField(max_length=999)
    media = models.ImageField(blank=True)

    def get_absolute_url(self):
        return reverse('basic_post', args=[str(self.pk)])

    def __str__(self):
        return self.title
Also, I use the default User model that ships with Django.
I want to record which posts each user has read so I can send them posts they haven't read.
My question is: what is the best way to do so? If I use a many-to-many field, should I put it on the User model and save all the posts the user has read, or should I go in the other direction, putting the field on the Post model and saving, for each post, which users have read it?
There are going to be more than 1 million posts in the Post model and about 50,000 users, and I want the most efficient filters for returning unread posts to a user.
If I should use the first option, how do I extend the User model?
thanks!
On your first question (which way to go): I believe that ManyToMany by default creates indices in the DB for both foreign keys. Therefore, wherever you put the relation, in User or in BasicPost, you'll have the direct and reverse relationships working through an index. Django will create for you a pivot table with three columns like: (id, user_id, basic_post_id). Every access to this table will go through the index on user_id or basic_post_id, and each (user_id, basic_post_id) pair is unique. So it's within your application that you'll decide whether you start filtering from the set of 1 million posts or from the 50k users.
On your second question (how to overload User), it's generally recommended to subclass User from the very beginning. If that's too late and your project is too far advanced for that, you can do this in your models.py:
from django.contrib.auth.models import User

class BasicPost(models.Model):
    # your code
    readers = models.ManyToManyField(to='auth.User', related_name="posts_already_read")

# "manually" add a method to the User class
def _unread_posts(user):
    # exclude every post that has this user among its readers
    return BasicPost.objects.exclude(readers=user)

User.unread_posts = _unread_posts
Haven't run this code though! Hope this helps.
Could you have a separate ReadPost model instead of a potentially large m2m, which you could save when a user reads a post? That way you can just query the ReadPost models to get the data, instead of storing it all in the blog post.
Maybe something like this:
from django.utils import timezone

class UserReadPost(models.Model):
    user = models.ForeignKey("auth.User", on_delete=models.CASCADE, related_name="read_posts")
    seen_at = models.DateTimeField(default=timezone.now)
    post = models.ForeignKey(BasicPost, on_delete=models.CASCADE, related_name="read_by_users")
You could add a unique_together constraint to make sure that only one UserReadPost object is created for each user and post (to make sure you don't count any twice), and use get_or_create() when creating new records.
Then finding the posts a user has read is:
posts = UserReadPost.objects.filter(user=current_user).values_list("post", flat=True)
This could also be extended relatively easily. For example, if your BasicPost objects can be edited, you could add an updated_at field to the post. Then you could compare the seen_at of the UserReadPost object to the updated_at of the BasicPost to check whether the user has seen the updated version.
Downside is you'd be creating a lot of rows in the DB for this table.
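As a rough illustration of how such a table answers the "unread posts" question, here is a sketch using an in-memory SQLite database instead of Django. The table names, the helper functions mark_read and unread_post_ids, and the sample data are all hypothetical; the UNIQUE constraint plays the role of unique_together, and INSERT OR IGNORE stands in for get_or_create().

```python
import sqlite3

# Hypothetical tables standing in for BasicPost and UserReadPost.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE basic_post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE user_read_post (
        user_id INTEGER NOT NULL,
        post_id INTEGER NOT NULL REFERENCES basic_post(id),
        seen_at TEXT DEFAULT CURRENT_TIMESTAMP,
        UNIQUE (user_id, post_id)
    );
    INSERT INTO basic_post VALUES (1, 'a'), (2, 'b'), (3, 'c');
""")

def mark_read(user_id, post_id):
    """Record a read; repeated calls for the same pair are ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO user_read_post (user_id, post_id) VALUES (?, ?)",
        (user_id, post_id),
    )

def unread_post_ids(user_id):
    """Return ids of posts with no read record for this user."""
    rows = conn.execute(
        """
        SELECT p.id
        FROM basic_post AS p
        WHERE NOT EXISTS (
            SELECT 1 FROM user_read_post AS r
            WHERE r.post_id = p.id AND r.user_id = ?
        )
        ORDER BY p.id
        """,
        (user_id,),
    ).fetchall()
    return [row[0] for row in rows]

mark_read(7, 2)
mark_read(7, 2)  # duplicate, ignored thanks to the UNIQUE constraint
print(unread_post_ids(7))
```

The unique index on (user_id, post_id) is also what keeps the NOT EXISTS lookup cheap as the table grows.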
If you serve posts in chronological order (by created_date, for example), another option is to extend the user model with a latest_read_post_id field.
In that case:
class BasicPost(models.Model):
    # your code
    def is_read_by(self, user):
        # assumes ids grow with creation time and posts are read in order
        return self.id <= user.latest_read_post_id

Django Queryset Prefetch Optimization for iterating over nested results

I'm looking for a way to optimize a queryset result processing in Django by improving database access performance, taking into consideration that I need to fetch a nested relation.
Taking these models as example:
class Movie(models.Model):
    name = models.CharField(max_length=50)

class Ticket(models.Model):
    code = models.CharField(max_length=255, blank=True, unique=True)
    movie = models.ForeignKey(Movie, on_delete=models.CASCADE, related_name='tickets')

class Buyer(models.Model):
    name = models.CharField(max_length=50)

class Purchase(models.Model):
    tickets = models.ManyToManyField(Ticket, related_name='purchases')
    buyer = models.ForeignKey(Buyer, on_delete=models.CASCADE, related_name='purchases')
and this Movie QuerySet:
movies = Movie.objects.all().prefetch_related('tickets__purchases__buyer')
In case I need to retrieve all buyers from each Movie in the above QuerySet, this is one approach:
for movie in movies:
    buyers = Buyer.objects.filter(purchases__tickets__in=movie.tickets.all()).distinct()
But that will hit the database once for each Movie iterated. So to do this without a query per movie, I'm doing something like:
def get_movie_buyers(movie):
    buyers = set()
    for ticket in movie.tickets.all():
        for purchase in ticket.purchases.all():
            if purchase.buyer:
                buyers.add(purchase.buyer)
    return buyers

for movie in movies:
    buyers = get_movie_buyers(movie)
This approach avoids per-movie queries thanks to the prefetch_related in the QuerySet, but it doesn't look optimal: I'm iterating over nested loops in Python, which shifts the cost to application memory and CPU instead.
There might be a better approach that I couldn't figure out yet, looking for some guidance.
UPDATE
Alasdair suggested using a Prefetch object, so I tried that:
from django.db.models import Prefetch

movies = Movie.objects.prefetch_related(
    Prefetch(lookup='tickets__purchases__buyer',
             to_attr='buyers')
).all()
for movie in movies:
    print(movie.buyers)
But this gives me the following error:
'Movie' object has no attribute 'buyers'
The reason this seems so difficult is the ManyToMany relation between Purchase and Ticket.
This relation allows the same ticket to be present in multiple purchases. But that will not be the case in the actual data, as one ticket can be purchased only once.
The query can be simplified if you remove this ManyToMany field and add a ForeignKey to Purchase on Ticket:
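class Ticket(models.Model):
    code = models.CharField(max_length=255, blank=True, unique=True)
    movie = models.ForeignKey(Movie, on_delete=models.CASCADE, related_name='tickets')
    # string reference, so Purchase may be defined later in the module
    purchase = models.ForeignKey('Purchase', on_delete=models.CASCADE, null=True, blank=True)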
Then the query can be simplified as below.
movies = Movie.objects.all().prefetch_related('tickets__purchase__buyer')
for movie in movies:
    # .all() reuses the prefetched tickets; the manager itself is not iterable
    print(set(ticket.purchase.buyer for ticket in movie.tickets.all() if ticket.purchase))
Of course, this adds complexity when creating Purchases, as you need to update the ticket objects with the purchase_id.
You need to decide where to keep the complexity based on how frequent each of the two actions is.

How to retrieve a set of objects, filtered and ordered by fields of other objects, for which the desired object is a foreign key?

To rephrase the title to the context of my problem: How to retrieve a set of foods, filtered and ordered by fields of other objects, for which the food object is a foreign key?
I have the following models:
class Food(models.Model):
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    name = models.CharField(max_length=200)
    description = models.CharField(max_length=200, blank=True)

class DayOfFood(models.Model):
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    date = models.DateField()

    class Meta:
        # unique_together only takes effect inside Meta
        unique_together = ("user", "date")

class FoodEaten(models.Model):
    day = models.ForeignKey(DayOfFood, on_delete=models.CASCADE)
    food = models.ForeignKey(Food, on_delete=models.CASCADE)
    servings = models.FloatField(default=1)
I want to be able to retrieve the foods that a given user ate most recently. This collection of foods will be passed to a template so it must be a QuerySet to allow the template to loop over the food objects.
This is how far I got
days = DayOfFood.objects.filter(user=request.user)
foodeatens = FoodEaten.objects.filter(day__in=days)
foodeatens = foodeatens.order_by('day__date')
Now it feels like I am almost there, all the foods I want are contained in the FoodEaten objects in the resulting QuerySet. I do not know how to use "for ... in:" to get the food objects and still have them stored in a QuerySet. Is there a way to perform foreach or map to a QuerySet?
I do not want to rewrite the template to accept FoodEaten objects instead because the template is used by other views which do simply pass food objects to the template.
The solution
The answer from Shang Wang helped me write code that solves my problem:
days = DayOfFood.objects.filter(user=request.user)
foods = Food.objects.filter(
    foodeaten__day__in=days,
    foodeaten__day__user=request.user,
).order_by('-foodeaten__day__date')
That could be done using a chain of relations:
Food.objects.filter(
    foodeaten__day__in=days,
    foodeaten__day__user=request.user,
).order_by('foodeaten__day__date')
By the way, I'm not sure why you have user on both the Food and DayOfFood models. If you really need user relations on both of them, consider giving each field a more explicit name that reflects the user's role in that model; otherwise you will get confused very quickly.

Django: distinct QuerySet based on a related field

In my Django app I allow users to create collections of movies by category. This is represented using 3 models, Movie, Collection, and Addition (the Addition model stores movie, collection, and user instances). Simplified versions of all three models are below.
class Movie(models.Model):
    name = models.CharField(max_length=64)

class Collection(models.Model):
    name = models.CharField(max_length=64)
    user = models.ForeignKey(User, on_delete=models.CASCADE)

class Addition(models.Model):
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    movie = models.ForeignKey(Movie, on_delete=models.CASCADE)
    collection = models.ForeignKey(Collection, on_delete=models.CASCADE)
So for example a user could create a collection called "80's movies", and add the movie "Indiana Jones" to their collection.
My question is: how do I display a distinct list of movies based on a set of query filters? Right now I am getting a bunch of duplicates for those movies that have been added to more than one collection. I would normally use distinct() to get distinct objects, but in this case I need distinct movies rather than distinct additions, but I need to query the Addition model because I want to allow the user to view movies added by their friends.
Am I setting up my models in an optimal way? Any advice/help would be much appreciated.
Thanks!
First, I don't think you need the Addition model here. You're trying to create a many-to-many relation, and there's a documented way of doing this:
class Movie(models.Model):
    name = models.CharField(max_length=64)

class Collection(models.Model):
    name = models.CharField(max_length=64)
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    # null has no effect on a ManyToManyField, so only blank is set
    movies = models.ManyToManyField('Movie', blank=True)
Second. The documentation says: "To refer to a "reverse" relationship, just use the lowercase name of the model".
So the answer is (for the setup above):
Movie.objects.filter(collection__user=user).distinct()
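To show why the duplicates appear and what distinct() changes, here is a small sketch of the equivalent join on an in-memory SQLite database. The table names (including the collection_movies link table), the movie_names helper and the sample rows are assumptions for illustration; Django's generated names carry the app label.

```python
import sqlite3

# Hypothetical tables for Movie, Collection and the many-to-many link table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE movie (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE collection (id INTEGER PRIMARY KEY, name TEXT, user_id INTEGER);
    CREATE TABLE collection_movies (collection_id INTEGER, movie_id INTEGER);
    INSERT INTO movie VALUES (1, 'Indiana Jones'), (2, 'Blade Runner');
    INSERT INTO collection VALUES
        (1, '80s movies', 5), (2, 'Favourites', 5), (3, 'Sci-fi', 6);
    INSERT INTO collection_movies VALUES (1, 1), (2, 1), (1, 2), (3, 2);
""")

QUERY = """
    SELECT {distinct} movie.name
    FROM movie
    JOIN collection_movies AS cm ON cm.movie_id = movie.id
    JOIN collection ON collection.id = cm.collection_id
    WHERE collection.user_id = ?
    ORDER BY movie.name
"""

def movie_names(user_id, distinct):
    """List movie names in the user's collections, optionally de-duplicated."""
    sql = QUERY.format(distinct="DISTINCT" if distinct else "")
    return [row[0] for row in conn.execute(sql, (user_id,))]

print(movie_names(5, distinct=False))  # one row per (collection, movie) pair
print(movie_names(5, distinct=True))   # one row per movie
```

The join produces one row per collection a movie belongs to, which is why a movie added to two of the user's collections shows up twice until DISTINCT is applied.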