Efficient query with Generic Relations - django

These are my models:
class Comment(models.Model):
    content_type = models.ForeignKey(ContentType)
    object_id = models.PositiveIntegerField(_('object ID'))
    content_object = generic.GenericForeignKey()
    user = models.ForeignKey(User)
    comment = models.TextField(_('comment'))

class Post(models.Model):
    title = models.CharField(_('name'), max_length=80)
    creator = models.ForeignKey(User, related_name="created_posts")
    created = models.DateTimeField(_('created'), default=datetime.now)
    body = models.TextField(_('body'), null=True, blank=True)
Now in my views.py I get a post with this instruction:
post = get_object_or_404(Post, id=id)
At this point in my views.py, what is the most efficient query (with the ORM) to get all comments of that post?

You should define comments = generic.GenericRelation(Comment) on the Post, to give you easy access from Post to Comment. Once you've done that, it's a simple backwards relationship:
comments = post.comments.all()
Note that this isn't really a question of efficiency. Getting all the related items via a backwards generic relationship will always incur at most two queries: one to get the relevant ContentType, which is automatically cached on first look-up, and one to get the actual items.
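To make the two-query claim concrete, here is a minimal sketch that uses Python's built-in sqlite3 module rather than Django itself. The table and column names (content_type, comment, app_label, model) and the helper comments_for are simplified assumptions, not what Django actually generates; the point is only that the reverse generic look-up filters on a (content_type_id, object_id) pair.

```python
import sqlite3

# Hypothetical, simplified schema standing in for Django's tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE content_type (id INTEGER PRIMARY KEY, app_label TEXT, model TEXT);
    CREATE TABLE comment (
        id INTEGER PRIMARY KEY,
        content_type_id INTEGER REFERENCES content_type(id),
        object_id INTEGER,
        comment TEXT
    );
    INSERT INTO content_type VALUES (1, 'blog', 'post'), (2, 'blog', 'photo');
    INSERT INTO comment VALUES
        (1, 1, 10, 'first on post 10'),
        (2, 1, 10, 'second on post 10'),
        (3, 1, 11, 'on post 11'),
        (4, 2, 10, 'on photo 10, same object_id');
""")

def comments_for(conn, app_label, model, object_id):
    """Return the ids of comments attached to one object (hypothetical helper)."""
    # Query 1: resolve the content type. Django caches this after the first look-up.
    (ct_id,) = conn.execute(
        "SELECT id FROM content_type WHERE app_label = ? AND model = ?",
        (app_label, model),
    ).fetchone()
    # Query 2: fetch the items that point at this (content type, object id) pair.
    rows = conn.execute(
        "SELECT id FROM comment "
        "WHERE content_type_id = ? AND object_id = ? ORDER BY id",
        (ct_id, object_id),
    ).fetchall()
    return [row[0] for row in rows]

post_comment_ids = comments_for(conn, "blog", "post", 10)
```

The comment on the photo shares object_id 10 but is not returned, which is why both columns take part in the filter.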
If you had asked how to get all the comments for multiple posts as efficiently as possible, I'd point you to my blog for a good technique, but since you haven't I won't because that would just be blog-whoring.

Related

Getting comments for a post by certain users in Django using Postgre

I have two models, Post and Comment. My users are able to build friendships with one another. I want to write a method that, for a given queryset of posts, finds the comments that the user's friends have made on each post, while consuming as little memory as possible.
My models are:
class Post(models.Model):
    title = models.CharField(max_length=100)
    content = models.TextField(validators=[MaxLengthValidator(1200)])
    author = models.ForeignKey(Profile, on_delete=models.CASCADE)
    date_posted = models.DateTimeField(auto_now_add=True)

class Comment(models.Model):
    post = models.ForeignKey(Post, on_delete=models.CASCADE, related_name='comments')
    author = models.ForeignKey(Profile, on_delete=models.CASCADE)
    body = models.TextField(validators=[MaxLengthValidator(350)])
    created_on = models.DateTimeField(auto_now_add=True)
Now, assuming I have a list of the IDs of users who are friends with request.user, obtained with:
friends = Friend.objects.friends(request.user)
and I have a queryset of posts:
posts = Post.objects.select_related("author").filter(author__in=friends).order_by('-date_posted')
I can get the comment IDs of each post with:
post.comments.filter(author__in=friends).order_by('-created_on').values('id')[:2]
However, if I want to do this for hundreds of posts it will be extremely expensive. How can I get the last two comments by a user's friends from each post as fast as possible, using the least memory?
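No answer to this question is included here. One possible approach, offered only as a sketch, is to replace the per-post queries with a single query ordered by post and recency, then keep the first two rows of each group in Python. The function latest_per_post and the sample rows are hypothetical, and the ORM call in the comment is an assumption about how such rows might be fetched.

```python
from itertools import groupby, islice
from operator import itemgetter

def latest_per_post(rows, limit=2):
    """Keep the first `limit` comment ids for each post.

    `rows` must be (post_id, comment_id) pairs sorted by post_id,
    newest comment first within each post.
    """
    result = {}
    for post_id, group in groupby(rows, key=itemgetter(0)):
        result[post_id] = [comment_id for _, comment_id in islice(group, limit)]
    return result

# Hypothetical rows, as they might come back from ONE query such as:
#   Comment.objects.filter(post__in=posts, author__in=friends)
#          .order_by('post_id', '-created_on')
#          .values_list('post_id', 'id')
rows = [(1, 9), (1, 7), (1, 3), (2, 8), (3, 6), (3, 5)]
latest = latest_per_post(rows)
```

This trades hundreds of small queries for one larger one. It still reads every matching (post_id, id) pair, so whether it uses less memory in practice depends on how many comments each post has.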

Excluding objects from Django queryset based on recency

I have a reddit-like Django app where users can post interesting urls (links) and then publicly comment under them. The two data models to represent this are:
class Link(models.Model):
    description = models.TextField(validators=[MaxLengthValidator(500)])
    submitter = models.ForeignKey(User)
    submitted_on = models.DateTimeField(auto_now_add=True)

class Publicreply(models.Model):
    submitted_by = models.ForeignKey(User)
    answer_to = models.ForeignKey(Link)
    submitted_on = models.DateTimeField(auto_now_add=True)
    description = models.TextField(validators=[MaxLengthValidator(250)])
How do I query for all Links that have at least one publicreply and whose latest publicreply is not by self.request.user? I suspect it is something like the following:
Link.objects.filter(publicreply__isnull=False).exclude(**something here**)
Please advise! Performance is key too, hence the simpler the better!
For performance and simplicity you could cache both the number of replies and the latest reply:
class Link(models.Model):
    ...
    number_of_replies = models.PositiveIntegerField(default=0)
    latest_reply = models.ForeignKey('myapp.Publicreply', related_name='+', blank=True, null=True, on_delete=models.SET_NULL)
When a reply is entered, update the corresponding link.number_of_replies and link.latest_reply.
The query would then be:
Link.objects.filter(number_of_replies__gte=1)\
    .exclude(latest_reply__submitted_by=request.user)
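The bookkeeping step ("when a reply is entered, update the link") can be sketched without Django using the built-in sqlite3 module. The table layout and the helpers add_reply and links_for are assumptions made for illustration; in a Django project the same update would more likely live in the view that saves the reply or in a signal handler.

```python
import sqlite3

# Hypothetical, simplified tables mirroring Link and Publicreply.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE link (
        id INTEGER PRIMARY KEY,
        number_of_replies INTEGER NOT NULL DEFAULT 0,
        latest_reply_id INTEGER
    );
    CREATE TABLE publicreply (
        id INTEGER PRIMARY KEY,
        answer_to_id INTEGER NOT NULL,
        submitted_by_id INTEGER NOT NULL
    );
    INSERT INTO link (id) VALUES (1), (2), (3);
""")

def add_reply(conn, link_id, user_id):
    """Insert a reply and refresh the cached columns in one transaction."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO publicreply (answer_to_id, submitted_by_id) VALUES (?, ?)",
            (link_id, user_id),
        )
        conn.execute(
            "UPDATE link SET number_of_replies = number_of_replies + 1, "
            "latest_reply_id = ? WHERE id = ?",
            (cur.lastrowid, link_id),
        )

def links_for(conn, user_id):
    """Links with at least one reply whose latest reply is by someone else."""
    rows = conn.execute(
        "SELECT link.id FROM link "
        "JOIN publicreply ON publicreply.id = link.latest_reply_id "
        "WHERE link.number_of_replies >= 1 AND publicreply.submitted_by_id != ? "
        "ORDER BY link.id",
        (user_id,),
    ).fetchall()
    return [row[0] for row in rows]

add_reply(conn, 1, 100)  # link 1: latest reply by user 100
add_reply(conn, 1, 200)  # link 1: latest reply is now by user 200
add_reply(conn, 2, 100)  # link 2: latest reply by user 100
visible_to_100 = links_for(conn, 100)  # link 3 has no replies at all
```

Doing the insert and the update in the same transaction keeps the cached columns from drifting out of step with the reply table.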

Django Queryset of related objects, after prefiltering on original model

Given a queryset for one model, I want to get a queryset of another model that is related by foreign key. Take the Django project docs' weblog schema:
class Blog(models.Model):
    name = models.CharField(max_length=100)
    tagline = models.TextField()

    def __unicode__(self):
        return self.name

class Author(models.Model):
    name = models.CharField(max_length=50)
    email = models.EmailField()

    def __unicode__(self):
        return self.name

class Entry(models.Model):
    blog = models.ForeignKey(Blog)
    headline = models.CharField(max_length=255)
    body_text = models.TextField()
    pub_date = models.DateField()
    mod_date = models.DateField()
    authors = models.ManyToManyField(Author)
    n_comments = models.IntegerField()
    n_pingbacks = models.IntegerField()
    rating = models.IntegerField()

    def __unicode__(self):
        return self.headline
Suppose I have an author object, and I want to get every blog that author has written for, as a queryset. I do something like author_blogs = [entry.blog for entry in author.entry_set.all()]. But I'm left with a list in this case, not a queryset. Is there a way I can do this directly with ORM queries, so I can set it up via a custom Entry manager with use_for_related_fields = True and do something like author_blogs = author.entry_set.blogs, and get the benefits of delayed evaluation, etc., of a queryset?
Edited scenario and solution
So, I realized after the fact that the application of my question is slightly different from how I posed it above, for which Daniel Roseman's solution makes a lot of sense. My situation is really more like author.entry_set.manager_method().blogs, where manager_method() returns a queryset of Entry objects. I accepted his answer because it inspired the solution I found, which is to do:
author_blogs = Blog.objects.filter(entry__in=author.entry_set.manager_method())
The nice thing is that it only uses one DB query. It's a bit tricky and verbose, so I think it's best to define blogs() as an object method of Author, returning the above.
The trick for this is to remember that if you want a queryset of Blogs, you should start with the Blog model. Then, you can use the double-underscore syntax to follow relations. So:
author_blogs = Blog.objects.filter(entry__authors=author)
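One thing to watch for: a filter that spans a multi-valued relation like this is executed as a join, so a blog can come back once per matching entry, and adding .distinct() to the queryset removes the repeats. The sketch below shows the effect on the underlying SQL using the built-in sqlite3 module; the table names and sample data are simplified assumptions, not Django's real table names.

```python
import sqlite3

# Hypothetical, simplified tables for Blog, Entry and the Entry-Author link table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blog (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE entry (id INTEGER PRIMARY KEY, blog_id INTEGER, headline TEXT);
    CREATE TABLE entry_authors (entry_id INTEGER, author_id INTEGER);
    INSERT INTO blog VALUES (1, 'Beatles Blog'), (2, 'Pop Music Blog');
    INSERT INTO entry VALUES (1, 1, 'a'), (2, 1, 'b'), (3, 2, 'c');
    INSERT INTO entry_authors VALUES (1, 7), (2, 7), (3, 8);
""")

AUTHOR_ID = 7  # this author wrote two entries, both on blog 1

# Roughly the shape of Blog.objects.filter(entry__authors=author)
plain = [row[0] for row in conn.execute(
    "SELECT blog.id FROM blog "
    "JOIN entry ON entry.blog_id = blog.id "
    "JOIN entry_authors ON entry_authors.entry_id = entry.id "
    "WHERE entry_authors.author_id = ? ORDER BY blog.id",
    (AUTHOR_ID,),
)]

# Roughly the shape of the same queryset with .distinct() appended
unique = [row[0] for row in conn.execute(
    "SELECT DISTINCT blog.id FROM blog "
    "JOIN entry ON entry.blog_id = blog.id "
    "JOIN entry_authors ON entry_authors.entry_id = entry.id "
    "WHERE entry_authors.author_id = ? ORDER BY blog.id",
    (AUTHOR_ID,),
)]
```

Here plain contains blog 1 twice, once per entry, while unique contains it once.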

Django generic relation problem

I'm having problems coming up with a filter in one of my views. I'm creating a site with blog entries, news articles, and reviews. The entries and articles have generic relations with the reviews, because the reviews can tag either of them. What I'm trying to do is to sort the entries/articles based on the sum of the ratings of reviews newer than a certain date.
Here are the simplified models:
class Entry(models.Model):
    name = models.CharField(max_length=50)
    reviews = generic.GenericRelation(Review)

class Article(models.Model):
    name = models.CharField(max_length=50)
    reviews = generic.GenericRelation(Review)

class Review(models.Model):
    rating = models.IntegerField()
    content_type = models.ForeignKey(ContentType)
    object_id = models.PositiveIntegerField()
    target = generic.GenericForeignKey('content_type', 'object_id')
    timestamp = models.DateTimeField(auto_now_add=True)
So, given that I needed to find a sum, I tried using annotate and aggregate, but I ran into two problems. The first one is that apparently generic relations and annotations don't work nicely together: https://code.djangoproject.com/ticket/10461. The second issue is that I don't think it's possible to only sum part of the reviews (in this case, with timestamp__gte=datetime.now()). Am I doing this the wrong way?
I also thought about doing this the other way around:
Review.objects.filter(timestamp__gte=datetime.now(), target__in=something).aggregate(Sum('rating'))
But since I'm trying to order the reviews based on this, don't I need to start with Review.something so I can use order_by?
Thanks.
I would highly suggest using Multi-table inheritance instead of generic foreign keys:
class ReviewedItem(models.Model):
    item_type = models.IntegerField()  # exclude me on admin

    def save(self, *args, **kwargs):
        # Record which subclass this row belongs to before saving.
        self.item_type = self.ITEM_TYPE
        super(ReviewedItem, self).save(*args, **kwargs)

class Entry(ReviewedItem):
    ITEM_TYPE = 1
    name = models.CharField(max_length=50)

class Article(ReviewedItem):
    ITEM_TYPE = 2
    name = models.CharField(max_length=50)

class Review(models.Model):
    item = models.ForeignKey(ReviewedItem)
    rating = models.IntegerField()
    timestamp = models.DateTimeField(auto_now_add=True)
After doing some research, I found out that the only way to solve my problem was to write custom sql using the "extra" method of QuerySets.
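The custom SQL itself is not shown above. As a rough sketch of what such a query could look like, the example below uses the built-in sqlite3 module to order entries by the sum of ratings from reviews newer than a cutoff. The table names, the sample data, the ENTRY_CONTENT_TYPE_ID constant and the helper entries_by_recent_rating are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical, simplified tables for Entry and Review.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entry (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE review (
        id INTEGER PRIMARY KEY,
        rating INTEGER,
        content_type_id INTEGER,
        object_id INTEGER,
        timestamp TEXT
    );
    INSERT INTO entry VALUES (1, 'old favourite'), (2, 'new hit'), (3, 'unreviewed');
    INSERT INTO review (rating, content_type_id, object_id, timestamp) VALUES
        (5, 1, 1, '2010-01-01'),
        (2, 1, 1, '2011-06-01'),
        (4, 1, 2, '2011-06-02'),
        (3, 1, 2, '2011-06-03'),
        (5, 2, 2, '2011-06-04');
""")

ENTRY_CONTENT_TYPE_ID = 1  # assumed id of the ContentType row for Entry

def entries_by_recent_rating(conn, since):
    """Return (entry id, score) pairs, highest recent rating sum first."""
    return conn.execute(
        """
        SELECT entry.id, COALESCE(SUM(review.rating), 0) AS score
        FROM entry
        LEFT JOIN review
               ON review.object_id = entry.id
              AND review.content_type_id = ?
              AND review.timestamp >= ?
        GROUP BY entry.id
        ORDER BY score DESC, entry.id
        """,
        (ENTRY_CONTENT_TYPE_ID, since),
    ).fetchall()

ranking = entries_by_recent_rating(conn, "2011-01-01")
```

Putting the date and content-type conditions in the join rather than in a WHERE clause keeps entries with no recent reviews in the result with a score of 0. The last review row belongs to a different content type with the same object_id, so it is ignored.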

Post and Comment with the same Model

I have created a simple project where everyone can create one or more blogs.
I want to use this model for both Post and Comment:
class Post_comment(models.Model):
    content_type = models.ForeignKey(ContentType)
    object_id = models.PositiveIntegerField(_('object ID'))
    content_object = generic.GenericForeignKey()
    # Hierarchy Field
    parent = models.ForeignKey('self', null=True, blank=True, default=None, related_name='children')
    # User Field
    user = models.ForeignKey(User)
    # Date Fields
    date_submitted = models.DateTimeField(_('date/time submitted'), default=datetime.now)
    date_modified = models.DateTimeField(_('date/time modified'), default=datetime.now)
    title = models.CharField(_('title'), max_length=60, blank=True, null=True)
    post_comment = models.TextField(_('post_comment'))
If it is a comment, the parent is not null.
So in most cases the text field will contain only a little text.
Can I use this model for both Post and Comment?
Is it a good solution?
It's technically possible, but sounds like a bad solution.
Most of the queries you'll run are specific to either post or comment (examples: get all posts ordered by date to show on the blog index page, get 5 most recent posts' titles to show on a widget, get 5 most recent comments to show in a "latest comments" widget, get all comments of a specific post, get all the posts that a user has posted, etc). So the cost of having them in the same table is always having the .filter(parent=None) which means less readable code and some performance loss.