Query optimization: howto

I have my models like this:
class Personne(BaseModel):
    # [skip] many fields then:
    photos = models.ManyToManyField(Photo, blank=True,
                                    through='PersonnePhoto',
                                    symmetrical=False,
                                    related_name='parent')
    def photo_profil(self):
        a = PersonnePhoto.objects.filter(
            personne=self, photo_type=PersonnePhoto.PHOTO_PROFIL)
        return a[0] if len(a) else None
    # [skip] many fields then:
    travels = models.ManyToManyField(
        TagWithValue, blank=True,
        through='PersonneTravel',
        default=None, symmetrical=False,
        related_name='personne_travel')
class PersonneRelation(BaseModel):
    src = models.ForeignKey('Personne', related_name='src')
    dst = models.ForeignKey('Personne', related_name='dst')
    opposite = models.ForeignKey('PersonneRelation',
                                 null=True, blank=True, default=None)
    is_reverse = models.BooleanField(default=False)
I need to show all contacts of one person, and for each contact, show all his/her travels, and his/her photo.
Here's my code in my view:
class IndexView(LoginRequiredMixin, generic.TemplateView):
    template_name = 'my_home/contacts/index.html'
    def get_context_data(self, **kwargs):
        context = super(IndexView, self).get_context_data(**kwargs)
        context['contacts'] = Personne.objects.get(
            user=self.request.user).relations.all()
        return context
Pretty simple.
The problem is in my template. my_home/contacts/index.html includes contact_detail.html for each contact:
{% for c in contacts %}
{% with c as contact %}
{% include 'includes/contact_detail.html' %}
{% endwith %}
{% endfor %}
In contact_detail.html, I call photo_profil, which makes a query to get the value of the picture of the contact.
{{ contact.photo_profil }}
This means that if a user has 100 contacts, I run one query to fetch the contacts and then one more query per contact, 101 queries in total. How can I optimize this? I have the same problem with travels, and in fact with every ManyToMany field of the contact.

Looks like you need some prefetch_related goodness:
context['contacts'] = (Personne.objects
                       .get(user=self.request.user)
                       .relations
                       .all()
                       .prefetch_related('travels', 'photos'))
Note, however, that this fetches all of each contact's photos, which is not what you seem to expect. You basically have two options. Either add some hairy raw SQL to the Prefetch object's queryset parameter, or (as I did in one of my projects) add a separate main_photo field for storing the user's avatar (a separate ForeignKey to Photo, I mean, not a completely independent field). Yes, it is clearly a denormalization, but the queries become much simpler. And your users get a way to set a main photo, after all.
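To make the idea concrete, here is a minimal plain-Python sketch of what a filtered prefetch boils down to: one query for the contacts, one query for all their profile photos, and a join in memory. It uses the standard-library sqlite3 module rather than Django, and the table names, column names and the PHOTO_PROFIL value are assumptions made up for the illustration. In Django, a similar result can probably be had by giving Prefetch a filtered PersonnePhoto queryset (plus to_attr), which needs only ordinary ORM filters.

```python
import sqlite3
from collections import defaultdict

PHOTO_PROFIL = 1  # assumed value, standing in for PersonnePhoto.PHOTO_PROFIL

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE personne (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE personne_photo (
        id INTEGER PRIMARY KEY,
        personne_id INTEGER REFERENCES personne(id),
        photo_type INTEGER,
        url TEXT
    );
    INSERT INTO personne VALUES (1, 'Ana'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO personne_photo VALUES
        (1, 1, 1, 'ana-profile.jpg'),
        (2, 1, 2, 'ana-holiday.jpg'),
        (3, 2, 1, 'bob-profile.jpg');
""")


def contacts_with_profile_photo(conn):
    # Query 1: all contacts.
    contacts = conn.execute(
        "SELECT id, name FROM personne ORDER BY id").fetchall()
    ids = [row[0] for row in contacts]
    if not ids:
        return []
    # Query 2: the profile photos of every contact at once.
    marks = ",".join("?" * len(ids))
    rows = conn.execute(
        "SELECT personne_id, url FROM personne_photo "
        f"WHERE photo_type = ? AND personne_id IN ({marks}) ORDER BY id",
        [PHOTO_PROFIL, *ids],
    ).fetchall()
    # Join in memory: group photo URLs by contact id.
    by_contact = defaultdict(list)
    for personne_id, url in rows:
        by_contact[personne_id].append(url)
    # First profile photo per contact, or None when there is none.
    return [(name, (by_contact[cid] or [None])[0]) for cid, name in contacts]


result = contacts_with_profile_photo(conn)
```

The number of queries stays at two no matter how many contacts there are, which is the whole point compared with calling photo_profil once per contact.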

Related

DJANGO Supplement to the data list via an intermediate table

I have models connected through an intermediate model.
Simplified:
class Person(models.Model):
    first_name = models.CharField(max_length=20)
    last_name = models.CharField(max_length=20)
    # etc...
class Company(models.Model):
    name = models.CharField(max_length=60)
    # etc...
class CompanyEnrollment(models.Model):
    person = models.ForeignKey(Person, on_delete=models.CASCADE)
    company = models.ForeignKey(Company, on_delete=models.CASCADE)
    company_position = models.ForeignKey(CompanyPosition, on_delete=models.CASCADE)
    # etc...
    class Meta:
        unique_together = [['person', 'company']]
class CompanyPosition(models.Model):
    name = models.CharField(max_length=60)
I want to build the following list:
datas = Person.objects.All(...{All elements of Person model supplemented with CompanyPosition_name}...)
A person may have no company association at all.
Is it possible to solve this with a query?

If you have an explicit through model for your many-to-many relationship, you can use it to keep or exclude persons depending on whether they have rows in that table, using an Exists clause:
from django.db.models import Exists, OuterRef
enrollments = CompanyEnrollment.objects.filter(person_id=OuterRef('pk'))
enrolled_persons = Person.objects.filter(Exists(enrollments))
unenrolled_persons = Person.objects.filter(~Exists(enrollments))
If you don't have an explicit intermediate table, then Django has generated it for you. You should be able to use it by referring to it as Person.enrollments.through instead of CompanyEnrollment.
Since you did not say much about the CompanyEnrollment table, I assumed you are in the second case.

This is not the best solution in my opinion, but it works for now with little data. I expect it to be slow with a lot of data.
In views.py I build the two querysets I need:
datas = Person.objects.all().order_by('last_name', 'first_name')
# select_related avoids an extra query per enrollment when the
# template reads company.person and company.company
companys = CompanyEnrollment.objects.select_related('person', 'company')
Use the Paginator:
p = Paginator(datas, 20)  # paginate the ordered queryset
page = request.GET.get('page')
pig = p.get_page(page)
# a string with one character per page, so the template can loop over it
pages = "p" * pig.paginator.num_pages
Render the HTML:
context = {
    "datas": datas,
    "pig": pig,
    "pages": pages,
    "companys": companys,
}
return render(request, "XY.html", context)
HTML template:
{% for datas in pig %}
    {{datas.first_name}} {{datas.last_name}}
    {% for company in companys %}
        {% if datas == company.person %}
            {{company.company.name}} <br>
        {% endif %}
    {% endfor %}
{% endfor %}
Not the most beautiful, but it works. I would still accept a more elegant solution.
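For comparison, the nested template loop above can be replaced by a single LEFT JOIN, which returns every person once per enrollment and a NULL position for persons without one. The sketch below shows the shape of that query with the standard-library sqlite3 module. The table and column names are assumptions loosely modelled on the question, not Django's actual generated names. In the ORM, annotating Person with an F() expression that follows the enrollment relation to the position name should produce a comparable join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (
        id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE company_position (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE company_enrollment (
        id INTEGER PRIMARY KEY,
        person_id INTEGER REFERENCES person(id),
        company_position_id INTEGER REFERENCES company_position(id)
    );
    INSERT INTO person VALUES (1, 'Ada', 'Byron'), (2, 'Carl', 'Adams');
    INSERT INTO company_position VALUES (1, 'Engineer');
    -- only Ada is enrolled; Carl has no company association
    INSERT INTO company_enrollment VALUES (1, 1, 1);
""")

# LEFT JOIN keeps persons without an enrollment; their position is NULL.
rows = conn.execute("""
    SELECT p.first_name, p.last_name, cp.name
    FROM person AS p
    LEFT JOIN company_enrollment AS ce ON ce.person_id = p.id
    LEFT JOIN company_position AS cp ON cp.id = ce.company_position_id
    ORDER BY p.last_name, p.first_name
""").fetchall()
```

A person enrolled in several companies appears once per enrollment, so the template may still need to group rows by person.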

Display objects from different models at the same page according to their published date

I have three different models for my app. All are working as I expected.
class Tender(models.Model):
    title = models.CharField(max_length=256)
    description = models.TextField()
    department = models.CharField(max_length=50)
    address = models.CharField(max_length=50)
    nature_of_work = models.CharField(choices=WORK_NATURE, max_length=1)
    period_of_completion = models.DateField()
    pubdat = models.DateTimeField(default=timezone.now)
class Job(models.Model):
    user = models.ForeignKey(settings.AUTH_USER_MODEL)
    title = models.CharField(max_length=256)
    qualification = models.CharField(max_length=256)
    interview_type = models.CharField(max_length=2, choices=INTERVIEW_TYPE)
    type_of_job = models.CharField(max_length=1, choices=JOB_TYPE)
    number_of_vacancies = models.IntegerField()
    employer = models.CharField(max_length=50)
    salary = models.IntegerField()
    pubdat = models.DateTimeField(default=timezone.now)
class News(models.Model):
    user = models.ForeignKey(settings.AUTH_USER_MODEL)
    title = models.CharField(max_length=150)
    body = models.TextField()
    pubdat = models.DateTimeField(default=timezone.now)
Right now I display each model on its own page (e.g. the jobs page shows only jobs). On the home page, however, I want to display all of them together, ordered by their published date. How can I display objects from different models on the same page? Should I create a separate model, e.g. class Post, and use a signal to create a new post whenever a Tender, Job, or News object is created? I really hope there is a better way to achieve this. Or should I use multi-table inheritance? Please help me. Thank you.
Update:
I don't want to show the objects of each model in separate groups on the same page, but mixed together like the feed of Facebook or any other social media site. On Facebook, every post (be it an image, a status, or a share) is displayed together on the home page. Likewise in my case, suppose a new Job object is created, and after that a new News object. Then I want to show the News object first, then the Job object, and so on.

A working solution
There are working solutions in two other answers. Both involve three queries, and each query reads its entire table with .all(). The results are then combined into a single list in Python. If each of your tables has about 10k records, this is going to put enormous strain on both your WSGI server and your database. Even if each table has only 100 records, you are needlessly looping over 300 objects in your view. In short: a slow response.
An efficient working solution.
Multi table inheritance is definitely the right way to go if you want a solution that is efficient. Your models might look like this:
class Post(models.Model):
    title = models.CharField(max_length=256)
    description = models.TextField()
    pubdat = models.DateTimeField(default=timezone.now, db_index=True)
class Tender(Post):
    department = models.CharField(max_length=50)
    address = models.CharField(max_length=50)
    nature_of_work = models.CharField(choices=WORK_NATURE, max_length=1)
    period_of_completion = models.DateField()
class Job(Post):
    user = models.ForeignKey(settings.AUTH_USER_MODEL)
    qualification = models.CharField(max_length=256)
    interview_type = models.CharField(max_length=2, choices=INTERVIEW_TYPE)
    type_of_job = models.CharField(max_length=1, choices=JOB_TYPE)
    number_of_vacancies = models.IntegerField()
    employer = models.CharField(max_length=50)
    salary = models.IntegerField()
class News(Post):  # inherits from Post, like Tender and Job
    user = models.ForeignKey(settings.AUTH_USER_MODEL)
    def _get_body(self):
        # body stays available as an alias for the inherited description
        return self.description
    body = property(_get_body)
Now your query is simply:
Post.objects.select_related(
    'job', 'tender', 'news').all().order_by('-pubdat')  # you really should slice
The pubdat field is now indexed (see the new Post model above). That makes the query really fast. There is no iteration through all the records in Python.
How do you find out which is which in the template? With something like this:
{% if post.tender %}
    {# render the tender #}
{% elif post.news %}
    {# render the news item #}
{% else %}
    {# render the job #}
{% endif %}
Further Optimization
There is some room in your design to normalize the database. For example, it's likely that the same company may post multiple jobs or tenders. As such, a company model might come in useful.
An even more efficient solution.
How about one without multi table inheritance or multiple database queries? How about a solution where you could even eliminate the overhead of rendering each individual item?
That comes courtesy of Redis sorted sets. Each time you save a Tender, Job or News object, you add it to a Redis sorted set.
import json
import redis
from django.db.models.signals import pre_delete, post_save
from django.dispatch import receiver
from django.forms.models import model_to_dict
FEED_KEY = 'feed'  # hypothetical key name for the sorted set
@receiver(post_save, sender=News)
@receiver(post_save, sender=Tender)
@receiver(post_save, sender=Job)
def add_to_redis(sender, instance, **kwargs):
    rdb = redis.Redis()
    # Instead of the serialized instance, you can consider storing the
    # rendered HTML; that ought to save you a few more CPU cycles.
    member = json.dumps(model_to_dict(instance), default=str)
    # redis-py 3.0+ takes a {member: score} mapping; the score must be a number
    rdb.zadd(FEED_KEY, {member: instance.pubdat.timestamp()})
    if rdb.zcard(FEED_KEY) > 100:  # choose a suitable number
        # drop everything except the 100 highest-scored (newest) entries
        rdb.zremrangebyrank(FEED_KEY, 0, -101)
Similarly, you need to add a pre_delete receiver to remove deleted objects from Redis.
The clear advantage of this method is that you don't need any database queries at all, your models continue to be really simple, and you get caching thrown into the mix. If you are on Twitter, your timeline is probably generated through a mechanism similar to this.

The following should do what you need. To improve performance, you can add an extra type field to each of your models so the annotation can be avoided.
Your view will look something like:
from django.db.models import CharField, Value
from django.shortcuts import render
def home(request):
    # annotate a type for each model to be used in the template
    tenders = Tender.objects.all().annotate(type=Value('tender', CharField()))
    jobs = Job.objects.all().annotate(type=Value('job', CharField()))
    news = News.objects.all().annotate(type=Value('news', CharField()))
    all_items = list(tenders) + list(jobs) + list(news)
    # all items sorted by publication date, most recent first
    all_items_feed = sorted(all_items, key=lambda obj: obj.pubdat, reverse=True)
    return render(request, 'home.html', {'all_items_feed': all_items_feed})
In your template, items come in the order they were sorted (by recency), and you can apply the appropriate html and styling for each item by distinguishing with the item type:
# home.html
{% for item in all_items_feed %}
{% if item.type == 'tender' %}
{% comment "html block for tender items" %}{% endcomment %}
{% elif item.type == 'news' %}
{% comment "html block for news items" %}{% endcomment %}
{% else %}
{% comment "html block for job items" %}{% endcomment %}
{% endif %}
{% endfor %}
You may avoid the annotation altogether by using the class of each model object to pick the appropriate html block.
Note that Django templates refuse lookups on names that start with an underscore, so item.__class__ cannot be read in a template directly. Expose the class name through a small custom template filter instead (called class_name here, a hypothetical name) that returns value.__class__.__name__.
For a Tender object, the filter returns 'Tender'.
So without using annotations in your home view, your template will look like:
{% for item in all_items_feed %}
    {% if item|class_name == 'Tender' %}
    {% elif item|class_name == 'News' %}
    ...
    {% endif %}
{% endfor %}
With this, you save the extra overhead of the annotations and do not have to modify your models.

A straightforward way is to use chain in combination with sorted:
View
# your_app/views.py
from django.shortcuts import render
from itertools import chain
from .models import Tender, Job, News
def feed(request):
    object_list = sorted(chain(
        Tender.objects.all(),
        Job.objects.all(),
        News.objects.all()
    ), key=lambda obj: obj.pubdat)
    return render(request, 'feed.html', {'feed': object_list})
Please note: the querysets above using .all() should be understood as placeholders. With a lot of entries this could become a performance issue, because the example code evaluates the querysets completely and only then sorts them. Up to some hundreds of records it likely will not have a (measurable) impact on performance, but in a situation with millions/billions of entries it is worth looking at.
To take a slice before sorting, order the queryset first so the slice holds the newest entries rather than an arbitrary 20:
Tender.objects.order_by('-pubdat')[:20]
or use a custom Manager for your models to off-load the logic.
class JobManager(models.Manager):
    def featured(self):
        # get_queryset() replaced the older get_query_set() in Django 1.6
        return self.get_queryset().filter(featured=True)
Then you can use something like:
Job.objects.featured()
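When every source is already ordered newest-first and sliced, the combined feed does not need a full sort: the sources can be merged lazily and cut off after the first few items. Here is a minimal sketch of that idea in plain Python, with a made-up Item dataclass standing in for the model instances. In the view, each list would be a sliced queryset ordered by -pubdat. The names latest_feed and Item are assumptions for the illustration only.

```python
from dataclasses import dataclass
from datetime import datetime
from heapq import merge
from itertools import islice


@dataclass
class Item:
    kind: str
    title: str
    pubdat: datetime


def latest_feed(sources, limit):
    """Lazily merge newest-first sequences and keep the first `limit` items."""
    # heapq.merge with reverse=True expects every input sorted descending.
    merged = merge(*sources, key=lambda obj: obj.pubdat, reverse=True)
    return list(islice(merged, limit))


# Each source is already ordered from newest to oldest.
tenders = [Item("tender", "Bridge repair", datetime(2024, 1, 5)),
           Item("tender", "Road survey", datetime(2024, 1, 1))]
jobs = [Item("job", "Site engineer", datetime(2024, 1, 4))]
news = [Item("news", "Office moves", datetime(2024, 1, 6)),
        Item("news", "New website", datetime(2024, 1, 2))]

feed = latest_feed([tenders, jobs, news], limit=3)
titles = [item.title for item in feed]
```

Because the result is already newest-first, a template looping over it would not need the reversed keyword.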
Template
If you need additional logic depending on the object class, create a simple template filter:
# templatetags/ctype_tags.py
from django import template
register = template.Library()
@register.filter
def ctype(value):
    return value.__class__.__name__.lower()
and
#templates/feed.html
{% load ctype_tags %}
<div>
{% for item in feed reversed %}
<p>{{ item|ctype }} - {{ item.title }} - {{ item.pubdat }}</p>
{% endfor %}
</div>
Bonus - combine objects with different field names
Sometimes you need to build this kind of feed from existing or third-party models. In that case the models do not share a common field name to sort by.
DATE_FIELD_MAPPING = {
    Tender: 'pubdat',
    Job: 'publish_date',
    News: 'created',
}
def date_key_mapping(obj):
    # look up the name of the date field for this object's model
    return getattr(obj, DATE_FIELD_MAPPING[type(obj)])
def feed(request):
    object_list = sorted(chain(
        Tender.objects.all(),
        Job.objects.all(),
        News.objects.all()
    ), key=date_key_mapping)
    return render(request, 'feed.html', {'feed': object_list})

Do I make a separate model e.g. class Post and then use signal to
create a new post whenever a new object is created from Tender, or
Job, or News? I really hope there is a better way to achieve this. Or
do I use multi-table inheritance?
I don't want to show each of the model objects separately at the same
page. But like feeds of facebook or any other social media.
I personally don't see anything wrong with using another model. IMHO it's even preferable to use another model, especially when there is an app for that.
Why? Because I would never want to rewrite code for something which can be achieved by extending my current code. You are over-engineering this problem, and if not now, you are gonna suffer later.

An alternative solution would be to use Django haystack:
http://haystacksearch.org/
http://django-haystack.readthedocs.io/en/v2.4.1/
It allows you to search through unrelated models. It's more work than the other solutions but it's efficient (1 fast query) and you'll be able to easily filter your listing too.
In your case, you will want to define pubdate in all the search indexes.

I cannot test it right now, but you should create a model like:
class Post(models.Model):
    pubdat = models.DateTimeField(default=timezone.now)
    # nullable, because each post points at only one of the three
    tender = models.ForeignKey('Tender', null=True, blank=True)
    job = models.ForeignKey('Job', null=True, blank=True)
    news = models.ForeignKey('News', null=True, blank=True)
Then, each time a Tender, Job or News object is created, you create a Post as well and relate it to that object. Each post should be related to only one of the three models.
Create a serializer for Post with nested serializers for Tender, Job and News.
Sorry for the short answer. If you think it can work for your problem, I'll write more later.

Django template 'IF' condition

I want to do something like
{% if "sumit" in feed.like.person.all %}
But this gives me a TemplateSyntaxError. How can I do this in Django?
(Basically, I want to check if 'sumit' exists in feed.like.person.all)
Here are my relevant models.
class Feed(models.Model):
    name = models.CharField(max_length=120)
    text = models.CharField(max_length=1200)
    timestamp = models.DateTimeField(auto_now=True, auto_now_add=False)
    updated = models.DateTimeField(auto_now=False, auto_now_add=True)
class Like(models.Model):
    feed = models.OneToOneField(Feed)
    counter = models.PositiveIntegerField()
    person = models.ManyToManyField(settings.AUTH_USER_MODEL, null=True, blank=True)

I think you intended to check the following:
# check if current user likes a feed
{% if request.user in feed.like.person.all %}
But if you are checking this for multiple feeds, this method becomes inefficient. For multiple feeds, a better approach is to use annotations, as mentioned by @AKS.

Your approach to check if a user likes a feed within the templates by querying for each feed is very inefficient.
I would suggest using Conditional Expressions to annotate each feed while fetching the queryset:
from django.db.models import BooleanField, Case, When, Value
feeds = Feed.objects.all().annotate(
    is_liked=Case(
        When(like__person=request.user, then=Value(True)),
        default=Value(False),
        output_field=BooleanField()))
This way you would be getting everything in one query only. And, then in the template you can just check is_liked on the feed:
{% if feed.is_liked %}You like this.{% endif %}
I haven't really executed this query but looking at the documentation it would be something similar.
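One thing worth checking when trying the annotation: a condition that follows a many-to-many relation is evaluated against a join, so a feed liked by several users may come back once per liker. An EXISTS subquery returns each feed exactly once. The sketch below shows both query shapes with the standard-library sqlite3 module. The table and column names are assumptions, not the names Django would generate. In the ORM, the second shape would correspond to annotating with Exists over a Like queryset filtered by OuterRef('pk') and the current user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE feed (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE feed_like (
        id INTEGER PRIMARY KEY,
        feed_id INTEGER UNIQUE REFERENCES feed(id));
    CREATE TABLE feed_like_person (
        like_id INTEGER REFERENCES feed_like(id),
        user_id INTEGER);
    INSERT INTO feed VALUES (1, 'first'), (2, 'second');
    INSERT INTO feed_like VALUES (1, 1), (2, 2);
    -- feed 1 is liked by users 7, 8 and 9; feed 2 only by user 8
    INSERT INTO feed_like_person VALUES (1, 7), (1, 8), (1, 9), (2, 8);
""")

current_user = 7

# Shape 1: CASE over a join. Feed 1 appears once per user who liked it.
joined = conn.execute("""
    SELECT f.id, CASE WHEN lp.user_id = ? THEN 1 ELSE 0 END AS is_liked
    FROM feed AS f
    LEFT JOIN feed_like AS l ON l.feed_id = f.id
    LEFT JOIN feed_like_person AS lp ON lp.like_id = l.id
    ORDER BY f.id, lp.user_id
""", [current_user]).fetchall()

# Shape 2: EXISTS subquery. Exactly one row per feed.
flagged = conn.execute("""
    SELECT f.id, EXISTS (
        SELECT 1
        FROM feed_like AS l
        JOIN feed_like_person AS lp ON lp.like_id = l.id
        WHERE l.feed_id = f.id AND lp.user_id = ?
    ) AS is_liked
    FROM feed AS f
    ORDER BY f.id
""", [current_user]).fetchall()
```

With the sample data, the join returns four rows for two feeds, while the EXISTS version returns one row per feed with a single flag.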

List of Articles from FeinCMS Content Types

My goal is to get a list of articles. The articles come from a simple FeinCMS content type:
class Article(models.Model):
    image = models.ForeignKey(MediaFile, blank=True, null=True, help_text=_('Image'), related_name='+',)
    content = models.TextField(blank=True, help_text=_('HTML Content'))
    style = models.CharField(
        _('template'), max_length=10, choices=(
            ('default', _('col-sm-7 Image left and col-sm-5 Content ')),
            ('fiftyfifty', _('50 Image left and 50 Content ')),
            ('around', _('small Image left and Content around')),
        ),
        default='default')
    class Meta:
        abstract = True
        verbose_name = u'Article'
        verbose_name_plural = u'Articles'
    def render(self, **kwargs):
        return render_to_string('content/articles/%s.html' % self.style, {'content': self,})
I would like to use it on different subpages.
Now it would be great to get a list of all articles on the main page (my projects -> list of project1, project2, project3, ...).
Something like: Article.objects.all()
Template:
{% for entry in article %}
    {% if content.parent_id == entry.parent_id %}{# only projects #}
        <p>{{ entry.content|truncatechars:180 }}</p>
    {% endif %}
{% endfor %}
But I get the error "type object 'Article' has no attribute 'objects'".
Do you have a smart idea? It would be great to keep using a FeinCMS content type.

FeinCMS content types are abstract, that means there is no data and no database table associated with them. Therefore, there's no objects manager and no way to query.
When doing Page.create_content_type(), FeinCMS takes the content type and the corresponding Page class and creates a (non-abstract) model which contains the actual data. In order to access that new, concrete model, you need to use content_type_for. In other words, you're looking for:
from feincms.module.page.models import Page
PageArticle = Page.content_type_for(Article)
articles = PageArticle.objects.all()

Count method in Django template not working as expected

I'm building a news app that allows members to post comments on articles. I want to display the articles in my template and also display the number of comments that have been made on each article. I tried using the count method but, it retrieved the total number of comments in my comments table instead of the number of comments that a particular article has.
#models.py
class Article(models.Model):
    #auto-generate indices for our options
    ENTRY_STATUS = enumerate(('no', 'yes'))
    #this will be a foreign key once account app is built
    author = models.CharField(default=1, max_length=1)
    category = models.ForeignKey(Category)
    title = models.CharField(max_length=50)
    entry = models.TextField()
    dateposted = models.DateTimeField(default=timezone.now, auto_now_add=True)
    draft = models.IntegerField(choices=ENTRY_STATUS, default=0)
    lastupdated = models.DateTimeField(default=timezone.now, auto_now=True)
    #prevents the generic labeling of our class as 'Classname object'
    def __unicode__(self):
        return self.title
class Comment(models.Model):
    #this will be a foreign key once account app is built
    author = models.CharField(default=1, max_length=1)
    article = models.ForeignKey(Article)
    dateposted = models.DateTimeField(auto_now_add=True)
    comment = models.TextField()
    def __unicode__(self):
        #returns the dateposted as a unicode string
        return unicode(self.dateposted)
#templates/list_articles.html
{% for comment in comments %}
{% if comment.article_id == article.id %}
{% if comments.count < 2 %}
#this is returning all comments in comment table
<b>{{ comments.count }} comment</b>
{% else %}
<b>{{ comments.count }} comments</b>
{% endif %}
{% endif %}
{% endfor %}
All the examples I've seen so far manually provide a value to filter by(e.g. Comment.objects.filter(article_id=x).count() ) In my case I only have access via the template.
#views.py
class ArticlesListView(ListView):
    context_object_name = 'articles'
    # only display published pieces (limit 5)
    queryset = Article.objects.select_related().order_by('-dateposted').filter(draft=0)[:5]
    template_name = 'news/list_articles.html'
    # overide this to pass additional context to templates
    def get_context_data(self, **kwargs):
        context = super(ArticlesListView, self).get_context_data(**kwargs)
        #get all comments
        context['comments'] = Comment.objects.order_by('-dateposted')
        #get all article photos
        context['photos'] = Photo.objects.all()
        #context['total_comments'] = Comment.objects.filter(article_id=Article)
        return context
My intended result is to have a listing of all articles and a roll-up of comments made on that article below each article(e.g. Article 1: 4 comments, Article 5: 1 comment, etc...) Right now I'm getting: Article 1: 4 comments, Article 5: 4 comments(even though Article 5 only has 1 comment)
Any help is appreciated. I've spent 5 hours reading through the documentation but every example manually provides a value to filter by.

I'm not sure why you find this unexpected. comments is all the comments, so of course comments.count is a count of all the comments. How could it be otherwise? You don't filter them anywhere.
This is however a really really horrible way to do things. There is absolutely no reason to pass all comments to the template and then iterate through them to check if they belong to the right article. You have a foreign key from Comment to Article, so you should use the reverse relationship to get the relevant comments.
Leave out the Comment query altogether from your view, and in your template just do this (replacing that whole block of nested fors and ifs):
{{ article.comment_set.count }}
This however does one count query per article. A better solution is to use annotations, so you can do it all in one single query. Change your queryset to add the annotated count of related comments. Since the foreign key on Comment declares no related_name, the lookup name for the reverse relation is the lowercased model name, comment:
from django.db.models import Count
class ArticlesListView(ListView):
    queryset = Article.objects.select_related().annotate(comment_count=Count('comment')).order_by('-dateposted').filter(draft=0)
and now you can just do
{{ article.comment_count }}
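For readers who want to see what the annotation amounts to, the sketch below runs the equivalent grouped query with the standard-library sqlite3 module: a LEFT JOIN from articles to comments with a COUNT per article. The table names and sample rows are assumptions chosen to mirror the numbers in the question, and the SQL that Django generates may differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE comment (
        id INTEGER PRIMARY KEY,
        article_id INTEGER REFERENCES article(id),
        comment TEXT);
    INSERT INTO article VALUES
        (1, 'Article 1'), (5, 'Article 5'), (6, 'Article 6');
    INSERT INTO comment (article_id, comment) VALUES
        (1, 'a'), (1, 'b'), (1, 'c'), (1, 'd'), (5, 'e');
""")

# One query: every article with the number of its own comments.
# COUNT(c.id) ignores the NULL row of articles without comments.
counts = conn.execute("""
    SELECT a.title, COUNT(c.id) AS comment_count
    FROM article AS a
    LEFT JOIN comment AS c ON c.article_id = a.id
    GROUP BY a.id, a.title
    ORDER BY a.id
""").fetchall()
```

Counting c.id rather than * matters here: with COUNT(*), an article without comments would report 1 because of the NULL row produced by the LEFT JOIN.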
