Django: retrieving ManyToManyField objects with minimum set of queries - django

My code looks like this:
models.py
class Tag(models.Model):
name = models.CharField(max_length=42)
class Post(models.Model):
user = models.ForeignKey(User, related_name='post')
#...various fields...
tags = models.ManyToManyField(Tag, null=True)
views.py
posts = Post.objects.all().values('id', 'user', 'title')
tags_dict = {}
for post in posts: # Iteration? Why?
p = Post.objects.get(pk=[post['id']]) # one extra query? Why?
tags_dict[post['id']] = p.tags.all()
How am I supposed to create a dictionary with tags for each Post object with minimum set of queries? Is it possible to avoid iterating, too?

Yes you will need a loop. But you can save one extra query in each iteration, you don't need to get post object to get all its tags. You can directly query on Tag model to get tags related to post id:
for post in posts:
tags_dict[post['id']] = Tag.objects.filter(post__id=post['id'])
Or use Dict Comprehension for efficiency:
tags_dict = {post['id']: Tag.objects.filter(post__id=post['id']) for post in posts}

If you have Django version >= 1.4 and don't really need a dictionary, but need to cut down the count of queries, you can use this method like this:
posts = Post.objects.all().only('id', 'user', 'title').prefetch_related('tags')
It seems to execute only 2 queries (one for Post and another for Tag with INNER JOIN).
And then you can access post.tags.all without extra queries, because tags was already prefetched.
{% for post in posts %}
{% for tag in post.tags.all %}
{{ tag.name }}
{% endfor %}
{% endfor %}

Related

Prefetching model with GenericForeignKey

I have a data structure in which a Document has many Blocks which have exactly one Paragraph or Header. A simplified implementation:
class Document(models.Model):
title = models.CharField()
class Block(models.Model):
document = models.ForeignKey(to=Document)
content_block_type = models.ForeignKey(to=ContentType)
content_block_id = models.CharField()
content_block = GenericForeignKey(
ct_field="content_block_type",
fk_field="content_block_id",
)
class Paragraph(models.Model):
text = models.TextField()
class Header(models.Model):
text = models.TextField()
level = models.SmallPositiveIntegerField()
(Note that there is an actual need for having Paragraph and Header in separate models unlike in the implementation above.)
I use jinja2 to template a Latex file for the document. Templating is slow though as jinja performs a new database query for every Block and Paragraph or Header.
template = get_template(template_name="latex_templates/document.tex", using="tex")
return template.render(context={'script': self.script})
\documentclass[a4paper,10pt]{report}
\begin{document}
{% for block in chapter.block_set.all() %}
{% if block.content_block_type.name == 'header' %}
\section{ {{- block.content_block.latex_text -}} }
{% elif block.content_block_type.name == 'paragraph' %}
{{ block.content_block.latex_text }}
{% endif %}
{% endfor %}
\end{document}
(content_block.latex_text() is a function that converts a HTML string to a Latex string)
Hence I would like to prefetch script.blocks and blocks.content_block. I understand that there are two methods for prefetching in Django:
select_related() performs a JOIN query but only works on ForeignKeys. It would work for script.blocks but not for blocks.content_block.
prefetch_related() works with GenericForeignKeys as well, but if I understand the docs correctly, it can only fetch one ContentType at a time while I have two.
Is there any way to perform the necessary prefetching here? Thank you for your help.
Not really an elegant solution but you can try using reverse generic relations:
from django.contrib.contenttypes.fields import GenericRelation
class Paragraph(models.Model):
text = models.TextField()
blocks = GenericRelation(Block, related_query_name='paragraph')
class Header(models.Model):
text = models.TextField()
level = models.SmallPositiveIntegerField()
blocks = GenericRelation(Block, related_query_name='header')
and prefetch on that:
Document.objects.prefetch_related('block_set__header', 'block_set__paragraph')
then change the template rendering to something like (not tested, will try to test later):
\documentclass[a4paper,10pt]{report}
\begin{document}
{% for block in chapter.block_set.all %}
{% if block.header %}
\section{ {{- block.header.0.latex_text -}} }
{% elif block.paragraph %}
{{ block.paragraph.0.latex_text }}
{% endif %}
{% endfor %}
\end{document}
My bad, I did not notice that document is an FK, and reverse FK can not be joined with select_related.
First of all, I would suggest to add related_name="blocks" anyway.
When you prefetch, you can pass the queryset. But you should not pass filters by doc_id, Django's ORM adds it automatically.
And if you pass the queryset, you can also add select/prefetch related call there.
blocks_qs = Block.objects.all().prefetch_related('content_block')
doc_prefetched = Document.objects.prefetch_related(
Prefetch('blocks', queryset=blocks_qs)
).get(uuid=doc_uuid)
But if you don't need extra filters or annotation, the simpler syntax would probably work for you
document = (
Document.objects
.prefecth_related('blocks', 'blocks__content_block')
.get(uuid=doc_uuid)
)

What is the most db efficient way to get specific fields of a foreign key in a queryset using Wagtail page models

So I have this model:
class FieldPosition(Orderable):
page = ParentalKey('PlayerDetailPage', on_delete=models.CASCADE, related_name='field_position_relationship')
field_position = models.CharField(verbose_name=_('Field position'), max_length=3, choices=FIELD_POSITION_CHOICES, null=True)
And this Wagtail page model:
class PlayerDetailPage(Page):
content_panels = [ InlinePanel('field_position_relationship', label=_('Field position'), max_num=3), ]
I made a property in the page model:
#property
def position(self):
position = [
n.field_position for n in self.field_position_relationship.all()
]
return position
So I can access the field_position in my template with:
{% for p in page.position %}
{{ p }}
{% endfor %}
But is there any better way to do this? I feel like I am hitting the database too much and I have a lot of queries. I would prefer something like:
PlayerDetailPage.objects.values('field_position_relationship__field_position')
But then I get the values of all PlayerDetailPage entries , I just want the value of the particular page that is displaying. How can I achieve something like that?
Probably something like this is what you're after. It is the combination of the two answers you had.
#property
def position(self):
return self.field_position_relationship.values('field_position')

Flask SQLAlchemy Foreign Key Querying

I am quite new to Python and programming in general. I could use some help with logic.
I have 3 tables, users, posts, comments. Table comments has a column "user_id" which has a Foreign_Key to the users column "id". Also as a post_id with FK to post_id.
comments.user_id = users.id && comments_post_id = posts.id
To query a specific post I am using:
post = Posts.query.filter_by(id=post_id).first_or_404()
To query the comments I am using:
comments = Comments.query.filter_by(post_id=post_id).order_by(Comments.time.desc()).all()
I would like to get the username from the users table where the comments.user_id.... This is where my logic gets scrambled.
Using the FK can I directly reference Users.username? Or does the FK just map a relationship so I can run the query. Where comments.user_id = Users.id .... get username and everything. I feel I am doing this quite inefficiently.
I'd like to replace comment.user_id with "Users.username where comment.user_id = user_id"
{% if comments %}
{% for comment in comments %}
{{ comment.user_id }}
{{ comment.time }}
{{ comment.comment }}
{% endfor %}
{% endif %}
I think what I need to do I make a new column in comments, comments.user_username & give it an FK of users.username. I keep getting a 150 error and the columns are identical. Same collation, datatype and length.
You can define a attribute which type is relationship of SQLAlchemy as follows:
class Comment(db.Model):
id = db.Column(db.Integer(), primary_key=1)
content = db.Column(db.Text(), nullable=0)
create_user_id = db.Column(db.Integer(), db.ForeignKey("user.id"))
create_user = db.relationship("User", foreign_keys=create_user_id)
post_id = dn.Column(db.Integer(), db.ForeignKey("post.id"))
You can write comment.create_user.username in the template file.

Django query caching through a ForeignKey on a GenericRelation

Using a GenericRelation to map Records to Persons, I have the following code, which works correctly, but there is a performance problem I am trying to address:
models.py
class RecordX(Record): # Child class
....
class Record(models.Model): # Parent class
people = GenericRelation(PersonRecordMapping)
def people_all(self):
# Use select_related here to minimize the # of queries later
return self.people.select_related('person').all()
class Meta:
abstract = True
class PersonRecordMapping(models.Model):
person = models.ForeignKey(Person)
content_type = models.ForeignKey(ContentType)
object_id = models.PositiveIntegerField()
content_object = GenericForeignKey('content_type', 'object_id')
class Person(models.Model):
...
In summary I have:
RecordX ===GenericRelation===> PersonRecordMapping ===ForeignKey===> Person
The display logic in my application follows these relationships to display all the Persons mapped to each RecordX:
views.py
#rendered_with('template.html')
def show_all_recordXs(request):
recordXs = RecordX.objects.all()
return { 'recordXs': recordXs }
The problem comes in the template:
template.html
{% for recordX in recordXs %}
{# Display RecordX's other fields here #}
<ul>
{% for map in record.people_all %}{# <=== This generates a query every time! #}
<li>{{ map.person.name }}</li>
{% endfor %}
</ul>
{% endfor %}
As indicated, every time I request the Persons related to the RecordX, it generates a new query. I cannot seem to figure out how prefetch these initially to avoid the redundant queries.
If I try selected_related, I get an error that no selected_related fields are possible here (error message: Invalid field name(s) given in select_related: 'xxx'. Choices are: (none)). Not surprising, I now see -- there aren't any FKs on this model.
If I try prefetch_related('people'), no error is thrown, but I still get the same repeated queries on every loop in the template as before. Likewise for prefetch_related('people__person'). If I try prefetch_related('personrecordmapping'), I get an error.
Along the lines of this answer, I considered doing trying to simulate a select_related via something like:
PersonRecordMapping.objects.filter(object_id__in=RecordX.objects.values_list('id', flat=True))
but I don't understand the _content_object_cache well enough to adapt this answer to my situation (if even the right approach).
So -- how can I pre-fetch all the Persons to the RecordXs to avoid n queries when the page will display n RecordX objects?
Thanks for your help!
Ack, apparently I needed to call both -- prefetch_related('people', 'people__person'). SIGH.

Django: using aggregates to show vote counts for basic polls app

I'm making a very basic poll app. It's similar to the one in the Django tutorial but I chose to break out the vote counting aspect into its own model (the tutorial just adds a vote count field alongside each answer). Here's my models:
class PollQuestion(models.Model):
question = models.CharField(max_length=75)
class PollAnswer(models.Model):
poll = models.ForeignKey('PollQuestion')
answer = models.CharField(max_length=75)
class PollVote(models.Model):
poll = models.ForeignKey('PollQuestion')
answer = models.ForeignKey('PollAnswer')
date_voted = models.DateTimeField(auto_now_add=True)
user_ip = models.CharField(max_length=75)
I'm trying to show all of the vote counts for a given poll. Here's my view code:
from django.db.models import Count
poll_votes = PollVote.objects.select_related('PollAnswer').filter(poll=poll_id).annotate(num_votes=Count('answer__id'))
When I output the results of this query I just get a single row per vote (eg I see about 40 'answers' for my poll, each one representing a vote for one of the 5 actual PollAnswers). If I look at the queries Django makes, it runs something like this for every vote in the poll:
SELECT `poll_answers`.`id`, `poll_answers`.`poll_id`, `poll_answers`.`answer`
FROM `poll_answers`
WHERE `poll_answers`.`id` = 101
Can anyone poke me in the right direction here? I get the feeling this should be easy.
EDIT: here's my template code, for completeness.
<ul>
{% for vote in votes %}
{{ vote.answer }} ({{ votes.num_votes }})<br />
{% endfor %}
</ul>
Try:
poll_votes = PollVote.objects.filter(poll=poll_id).annotate(num_votes=Count('answer__id'))
or:
poll_votes = PollVote.objects.values('poll', 'answer__answer').filter(poll=poll_id).annotate(num_votes=Count('answer__id'))
Relevant docs:
Django offical docs
Never mind, fixed it myself after finding a tutorial which uses the same sort of models as me.
Essentially the fix was in the view:
p = get_object_or_404(PollQuestion, pk=poll_id)
choices = p.pollanswer_set.all()
And in the template:
{% for choice in choices %}
<p class="resultsList">{{choice.answer}} - {{choice.pollvote_set.count}}</p>
{% endfor %}