Optimal project organization and querysets - django

I have two models, Company and Product, with a foreign key on Product:
class Product(Meta):
    company = models.ForeignKey(Company, related_name='products', on_delete=models.CASCADE)
For a view that gathers a company's products and uses information from both models, which approach is optimal?
1) Add the view to the companies app and use this queryset:
Company.objects.prefetch_related('products').get(pk=company_pk)
2) Add the view to the products app and use this queryset:
Product.objects.select_related('company').filter(company=company_pk)
Also, can ordering be chained with prefetch_related or select_related?

The Django docs illustrate the difference quite well:
prefetch_related(*lookups)
Returns a QuerySet that will
automatically retrieve, in a single batch, related objects for each of
the specified lookups.
This has a similar purpose to select_related, in that both are
designed to stop the deluge of database queries that is caused by
accessing related objects, but the strategy is quite different.
select_related works by creating an SQL join and including the fields
of the related object in the SELECT statement. For this reason,
select_related gets the related objects in the same database query.
However, to avoid the much larger result set that would result from
joining across a ‘many’ relationship, select_related is limited to
single-valued relationships - foreign key and one-to-one.
select_related(*fields)
Returns a QuerySet that will “follow” foreign-key relationships,
selecting additional related-object data when it executes its query.
This is a performance booster which results in a single more complex
query but means later use of foreign-key relationships won’t require
database queries.
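To make the two strategies concrete, here is a minimal sketch of the SQL shape behind each one. It uses the standard-library sqlite3 module rather than Django so that it runs on its own; the table names, column names, and sample rows are assumptions made for illustration, not the schema Django would generate. Ordering is simply an ORDER BY on whichever query returns the products.

```python
import sqlite3

# Hypothetical stand-ins for the Company and Product tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product (
        id INTEGER PRIMARY KEY,
        name TEXT,
        company_id INTEGER REFERENCES company(id)
    );
    INSERT INTO company VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO product VALUES (1, 'Anvil', 1), (2, 'Rocket', 1), (3, 'Widget', 2);
""")

# select_related style: a single query, with the company columns
# repeated on every product row via a join.
joined = conn.execute("""
    SELECT product.name, company.name
    FROM product
    INNER JOIN company ON product.company_id = company.id
    WHERE product.company_id = ?
    ORDER BY product.name
""", (1,)).fetchall()

# prefetch_related style: one query for the company, a second query for
# its products using an IN clause, then the "join" happens in Python.
company = conn.execute(
    "SELECT id, name FROM company WHERE id = ?", (1,)
).fetchone()
products = conn.execute(
    "SELECT name FROM product WHERE company_id IN (?) ORDER BY name",
    (company[0],),
).fetchall()
prefetched = {"company": company[1], "products": [row[0] for row in products]}

print(joined)
print(prefetched)
```

Both variants return the same information. The first does it in one round trip with duplicated company columns, the second in two round trips without duplication.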

Related

Override queryset 'only' method | Django

To reduce the number of queries when people access a frequently used foreign key reference, I've created a custom manager that always applies select_related for that relation.
However, there are places in the code where people call only() on this same model when building a queryset. How can I override the queryset's only() method so that it always includes this foreign key relation?
class CustomManager(models.Manager):
    def get_queryset(self):
        return super(CustomManager, self).get_queryset().select_related('user')
When the model is used as below, I run into an error which says: Field cannot be both deferred and traversed using select_related at the same time.
Model.objects.only('id','field_1').get(pk=1)
Is there any way to mitigate this? I have to use select_related as above, since it saves a lot of queries.

Django how to fetch related objects with a join?

My models are similar to the following:
class Reporter(models.Model):
    def gold_star(self):
        return self.article_set.get().total_views >= 100000

class Article(models.Model):
    reporter = models.ForeignKey(Reporter, on_delete=models.CASCADE)
    total_views = models.IntegerField(default=0, blank=True)
Then in one of the templates I have this line:
{% if r.gold_star %}<img src="{% static 'gold-star.png' %}">{% endif %}
Obviously django sends as many queries as there are reporters on the page... Ideally this could be just one query, which would select reporters by criteria and join appropriate articles. Is there a way?
EDIT
Neither select_related nor prefetch_related seems to work, because I'm selecting on the Reporter table and then using a RelatedManager to access related data on Article.
In other words, Django doesn't know what to prefetch until there is a non-empty queryset.
Because an article can have only one reporter, it must be possible to join these tables and then apply a filter to the subquery; I just can't find how to do that in Django's query language.
There is an alternative: select on the Article table and filter by Reporter fields. But that approach has a problem. If I deleted all the articles of some reporter, I could no longer include that reporter in the list, because from Article's point of view the reporter doesn't exist even though it is still in the Reporter table.
EDIT2
I tried what people suggested in the comments. The following generates desired query:
reporters = Reporter.objects.filter(**query).select_related().annotate(
    gold_star=Case(
        When(article__total_views__gte=0, then=Value(1)),
        default=Value(0),
        output_field=IntegerField()
    )
)
Query generated by django:
SELECT
    `portal_reporter`.`id`,
    ...,
    CASE WHEN `portal_article`.`total_views` >= 0 THEN 1 ELSE 0 END AS `gold_star`
FROM
    `portal_reporter`
    LEFT OUTER JOIN `portal_article`
        ON (`portal_reporter`.`id` = `portal_article`.`reporter_id`)
WHERE
    ...
Now I just need to work out how to produce a similar query without the Case/When expressions.
EDIT3
If I choose a slightly different strategy, Django selects the wrong join type:
query['article__id__gte'] = 0
reporters = Reporter.objects.filter(**query).select_related()
This code produces a similar query, but with an INNER JOIN instead of the desired LEFT OUTER JOIN:
SELECT
    `portal_reporter`.`id`,
    ...
FROM
    `portal_reporter`
    INNER JOIN `portal_article`
        ON (`portal_reporter`.`id` = `portal_article`.`reporter_id`)
WHERE
    ...
You can use select_related (https://docs.djangoproject.com/en/1.11/ref/models/querysets/#select-related) to do a join on the related table.
There's also prefetch_related (https://docs.djangoproject.com/en/1.11/ref/models/querysets/#prefetch-related) which uses an IN clause to fetch the related objects with an extra query. The difference is explained in the docs, but is reproduced below:
select_related works by creating an SQL join and including the fields of the related object in the SELECT statement. For this reason, select_related gets the related objects in the same database query. However, to avoid the much larger result set that would result from joining across a ‘many’ relationship, select_related is limited to single-valued relationships - foreign key and one-to-one.
prefetch_related, on the other hand, does a separate lookup for each relationship, and does the ‘joining’ in Python. This allows it to prefetch many-to-many and many-to-one objects, which cannot be done using select_related, in addition to the foreign key and one-to-one relationships that are supported by select_related. It also supports prefetching of GenericRelation and GenericForeignKey, however, it must be restricted to a homogeneous set of results. For example, prefetching objects referenced by a GenericForeignKey is only supported if the query is restricted to one ContentType.
Try annotating a new field gold_star and setting it to 1 if the reporter has an article with at least 100000 total_views, like this:
from django.db.models import Case, When, Value, IntegerField

reporters = Reporter.objects.annotate(
    gold_star=Case(
        When(article__total_views__gte=100000, then=Value(1)),
        default=Value(0),
        output_field=IntegerField()
    )
)
You can leave the template code as it is.
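The join-type concern raised in the question can be reproduced with a small sketch. It runs the same CASE expression once with a LEFT OUTER JOIN and once with an INNER JOIN, using the standard-library sqlite3 module instead of Django so that it is runnable on its own. The table names, the name column, and the sample rows are assumptions for illustration only.

```python
import sqlite3

# Hypothetical stand-ins for the Reporter and Article tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE reporter (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE article (
        id INTEGER PRIMARY KEY,
        reporter_id INTEGER REFERENCES reporter(id),
        total_views INTEGER DEFAULT 0
    );
    INSERT INTO reporter VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    -- Cy has no articles at all.
    INSERT INTO article VALUES (1, 1, 250000), (2, 2, 10);
""")

# Same query text for both runs; only the join type changes.
GOLD_STAR = """
    SELECT reporter.name,
           CASE WHEN article.total_views >= 100000 THEN 1 ELSE 0 END AS gold_star
    FROM reporter
    {join} article ON reporter.id = article.reporter_id
    ORDER BY reporter.id
"""

left_rows = conn.execute(GOLD_STAR.format(join="LEFT OUTER JOIN")).fetchall()
inner_rows = conn.execute(GOLD_STAR.format(join="INNER JOIN")).fetchall()

print(left_rows)   # reporters without articles are kept, with gold_star 0
print(inner_rows)  # reporters without articles are dropped
```

Note that a join like this yields one row per article, so a reporter with several articles would appear several times unless the rows are grouped or made distinct.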

django prefetch_related id only

I'm trying to optimise my queries but prefetch_related insists on joining the tables and selecting all the fields even though I only need the list of ids from the relations table.
You can ignore the 4th query. It's not related to the question.
Related Code:
class Contact(models.Model):
    ...
    Groups = models.ManyToManyField(ContactGroup, related_name='contacts')
    ...

queryset = Contact.objects.all().prefetch_related('Groups')
Django 1.7 added Prefetch objects which let you customise the queryset used when prefetching.
In particular, see only().
In this case, you'd want something like:
from django.db.models import Prefetch

queryset = Contact.objects.all().prefetch_related(
    Prefetch('Groups', queryset=ContactGroup.objects.all().only('id')))
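If only the related ids are needed, they already live in the many-to-many join table, so in principle they can be read without touching the group table at all. The sketch below shows that idea in plain SQL with the standard-library sqlite3 module. The join-table name and its column names are assumptions for illustration and may differ from what Django generates for this model.

```python
import sqlite3
from collections import defaultdict

# Hypothetical join table for the Contact <-> ContactGroup relation.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contact_groups (
        contact_id INTEGER,
        contactgroup_id INTEGER
    );
    INSERT INTO contact_groups VALUES (1, 10), (1, 11), (2, 10);
""")

# Ids of the contacts already loaded by the main query (assumed values).
contact_ids = [1, 2, 3]
placeholders = ", ".join("?" for _ in contact_ids)

# One extra query against the join table only: no join, no other columns.
pairs = conn.execute(
    f"SELECT contact_id, contactgroup_id FROM contact_groups "
    f"WHERE contact_id IN ({placeholders}) "
    f"ORDER BY contact_id, contactgroup_id",
    contact_ids,
).fetchall()

# Group the ids per contact in Python.
group_ids = defaultdict(list)
for contact_id, group_id in pairs:
    group_ids[contact_id].append(group_id)

print(dict(group_ids))
```

Contacts with no groups, such as contact 3 here, simply have no entry in the resulting mapping.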

Django ManyToMany in one query

I'm trying to optimise my app by keeping the number of queries to a minimum... I've noticed I'm getting a lot of extra queries when doing something like this:
class Category(models.Model):
    id = models.AutoField(primary_key=True)
    name = models.CharField(max_length=127, blank=False)

class Project(models.Model):
    categories = models.ManyToManyField(Category)
Then later, if I want to retrieve a project and all related categories, I have to do something like this :
{% for category in project.categories.all %}
Whilst this does what I want it does so in two queries. I was wondering if there was a way of joining the M2M field so I could get the results I need with just one query? I tried this:
def category_list(self):
    return self.join(list(self.category))
But it's not working.
Thanks!
Which, whilst does what I want, adds an extra query.
What do you mean by this? Do you want to pick up a Project and its categories using one query?
If you did mean this, then unfortunately there is no mechanism at present to do this without resorting to a custom SQL query. The select_related() mechanism used for foreign keys won't work here either. There is (was?) a Django ticket open for this but it has been closed as "wontfix" by the Django developers.
What you want does not seem to be possible, because:
At the DBMS level, a many-to-many relation cannot be stored directly, so an intermediate table is needed to join the two tables.
At the Django level, for your model definition, Django creates an extra table to hold the many-to-many connection. The table is named after your model and field, in this example something like [app_name]_project_categories, and it contains foreign keys to your two database tables.
So you cannot even access a field on the other table of a many-to-many connection through Django with a lookup such as categories__name in your model's filter or get functions.
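For reference, the custom SQL route mentioned above could look roughly like the following: a single query that walks from the project through the intermediate table to the categories. This is a sketch using the standard-library sqlite3 module rather than Django. The table names, the title column, and the sample rows are assumptions for illustration.

```python
import sqlite3

# Hypothetical tables mirroring Project, Category, and the join table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE project (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE project_categories (
        project_id INTEGER REFERENCES project(id),
        category_id INTEGER REFERENCES category(id)
    );
    INSERT INTO category VALUES (1, 'web'), (2, 'data');
    INSERT INTO project VALUES (1, 'Site'), (2, 'Empty');
    INSERT INTO project_categories VALUES (1, 1), (1, 2);
""")

# One query: project -> join table -> category. LEFT OUTER JOINs keep
# the project row even when it has no categories.
rows = conn.execute("""
    SELECT project.title, category.name
    FROM project
    LEFT OUTER JOIN project_categories
        ON project.id = project_categories.project_id
    LEFT OUTER JOIN category
        ON category.id = project_categories.category_id
    WHERE project.id = ?
    ORDER BY category.name
""", (1,)).fetchall()

# The project columns repeat on every row, so collapse them in Python.
title = rows[0][0]
categories = [name for _, name in rows if name is not None]

print(title, categories)
```

The cost of the single query is that the project columns are repeated once per category, which is the larger result set that the two-query approach avoids.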

custom query in django

I'm building an ecommerce website.
I have a Product model that holds info common to all product types:
class Product(models.Model):
    name = models.CharField()
    description = models.CharField()
    categories = models.ManyToManyField(Category)
Then I have SimpleProduct and BundleProduct that have FK to Product and hold info specific to the product type. BundleProduct has a m2m field to other Products.
class SimpleProduct(Product):
    some_field = models.CharField()

class BundleProduct(Product):
    products = models.ManyToManyField(Product)
When displaying the catalog I make one query against the Product model and then another query per product to get the additional info.
This involves a large number of queries.
I can improve it by using select_related on the simpleproduct and bundleproduct fields.
I can further improve it by using the select_reverse app for m2m fields like categories.
This is a big improvement, but more queries are still required because a BundleProduct has several products, which can in turn have relations to other products (configurable product).
Is there a way to have a single query against Product that retrieves the m2m categories, the one-to-one SimpleProduct and BundleProduct, and the BundleProduct's products?
Will this custom query look like a django queryset with all the managers and properties?
Thanks
You can possibly take a look at the extra method of querysets. May give you the opportunity to add some additional fields. But if you want raw queries, you can use the raw method of managers, these will return a type of queryset, that will not however harness the full power of normal querysets but should be enough for your concerns. On that same page the execute method is also shown, this is for truly custom sql that can't even translate into raw querysets.
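As a rough idea of what such custom SQL might contain, the sketch below fetches every product together with its optional SimpleProduct and BundleProduct rows in one query by left-joining the child tables. It uses the standard-library sqlite3 module rather than Django so that it runs on its own. The table names, the product_ptr_id link column, and the sample rows are assumptions for illustration; the real names depend on the app label and model definitions.

```python
import sqlite3

# Hypothetical parent table plus one table per child model.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE simpleproduct (
        product_ptr_id INTEGER PRIMARY KEY REFERENCES product(id),
        some_field TEXT
    );
    CREATE TABLE bundleproduct (
        product_ptr_id INTEGER PRIMARY KEY REFERENCES product(id)
    );
    INSERT INTO product VALUES (1, 'Pen'), (2, 'Desk set');
    INSERT INTO simpleproduct VALUES (1, 'blue');
    INSERT INTO bundleproduct VALUES (2);
""")

# One query for all products; the child columns are NULL when the
# product is not of that type.
rows = conn.execute("""
    SELECT product.name,
           simpleproduct.some_field,
           bundleproduct.product_ptr_id IS NOT NULL AS is_bundle
    FROM product
    LEFT OUTER JOIN simpleproduct
        ON simpleproduct.product_ptr_id = product.id
    LEFT OUTER JOIN bundleproduct
        ON bundleproduct.product_ptr_id = product.id
    ORDER BY product.id
""").fetchall()

print(rows)
```

This covers only the one-to-one child rows. The many-to-many parts (categories and a bundle's products) would multiply the rows if joined in the same statement, so they are usually fetched with a separate query.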