I need to get a list of all companies, each joined with its company user that has the lowest companyuser id.
There are two models:
class Company(models.Model):
    name = models.CharField(max_length=255)
    kind = models.CharField(max_length=255)

class CompanyUser(models.Model):
    company = models.ForeignKey('Company')
    email = models.EmailField(max_length=40, unique=True)
    # other fields
I've tried something like this:
companies = Company.objects.all().select_related(Min('companyuser__email'))
but it doesn't work. How can I do this with the Django ORM? Is there any way to do it without raw SQL?
from django.db.models import Min
Company.objects.annotate(lowest_companyuser_id=Min("companyuser__id"))
Explanation
select_related() can be used for telling Django which related tables should be joined to the resulting queryset for reducing the number of queries, namely solving the dreaded "N+1 problem" when looping over a queryset and accessing related objects in iteration. (see docs)
You were on the right track with Min(), but it has to be used together with the annotate() queryset method. Using annotate() with an aggregate expression such as Min(), Max() or Count() translates into an SQL query that applies that aggregate function with GROUP BY. (see docs about annotate() in Django, about GROUP BY in Postgres docs)
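To make the GROUP BY point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names are made up for illustration (Django derives its own from the app label), and the SQL only approximates what the ORM would generate for the annotate() call above.

```python
import sqlite3

# Hypothetical tables standing in for the Company / CompanyUser models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE companyuser (
        id INTEGER PRIMARY KEY,
        company_id INTEGER REFERENCES company(id),
        email TEXT UNIQUE
    );
    INSERT INTO company (id, name) VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Empty Inc');
    INSERT INTO companyuser (id, company_id, email) VALUES
        (10, 1, 'a@acme.test'),
        (11, 1, 'b@acme.test'),
        (12, 2, 'c@globex.test');
""")

# One row per company; MIN() is evaluated per GROUP BY group.
# The LEFT OUTER JOIN keeps companies that have no users at all.
rows = conn.execute("""
    SELECT company.id, company.name, MIN(companyuser.id) AS lowest_companyuser_id
    FROM company
    LEFT OUTER JOIN companyuser ON companyuser.company_id = company.id
    GROUP BY company.id, company.name
    ORDER BY company.id
""").fetchall()

for row in rows:
    print(row)
```

Note that a company without any users still shows up, with NULL (None in Python) as its lowest id.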
As Burhan said, do not rely on the pk, but if you must...
companies = Company.objects.all().order_by('pk')[0]
Related
I have been working with Django 2.x, but I'm going to use Django 3.x for my new project.
In version 2, when I needed an outer join, I used prefetch_related and filtered on the prefetched model.
In version 2, using prefetch_related resulted in a single query, but in version 3 it results in multiple queries.
If I only use Q() on the joined target, without prefetch_related, it runs as a single query in version 3.
from django.db import models
from django.db.models import Q
from django.db.models import Prefetch

class Member(models.Model):
    member_no = models.AutoField()
    member_name = models.CharField()

class Permission(models.Model):
    permission_no = models.AutoField()

class MemberPermission(models.Model):
    member_permission_no = models.AutoField()
    member_no = models.ForeignKey(
        Member, related_name='members', on_delete=models.CASCADE,
    )
    permission_no = models.ForeignKey(
        Permission, related_name='member_permissions', on_delete=models.CASCADE,
    )
my_permission = Member.objects.prefetch_related('member_permissions').filter(Q(member_permissions__isnull=False))[:1]
print(my_permission[0].member_permissions)
# member outer join permission, single query at django 2.X
# member outer join permission & additional query at django 3.x
my_permission = Member.objects.filter(Q(member_permissions__isnull=False))[:1]
print(my_permission[0].member_permissions)
# member outer join permission, single query at django 3.X
my_permission = Member.objects.prefetch_related(
    Prefetch(
        'member_permissions',
        MemberPermission.objects.select_related('permission_no').all(),
    )
).filter(Q(members__isnull=False))[:1]
print(my_permission[0].member_permissions.all()[0].permission_no.permission_no)
# member outer join permission & additional query at django 3.x
If I don't use prefetch_related, I get a single query.
But then I can't get the model behind the joined model (the Permission of a MemberPermission, via Member).
How can I get a single query with Prefetch() in Django 3?
This isn't a version difference. It's the way prefetch_related works. It will execute 1 extra query per outer join. However, this is still a lot less than executing 1 query per iteration. The documentation is very clear on this:
select_related works by creating an SQL join and including the fields of the related object in the SELECT statement. For this reason, select_related gets the related objects in the same database query. However, to avoid the much larger result set that would result from joining across a ‘many’ relationship, select_related is limited to single-valued relationships - foreign key and one-to-one.
prefetch_related, on the other hand, does a separate lookup for each relationship, and does the ‘joining’ in Python. This allows it to prefetch many-to-many and many-to-one objects, which cannot be done using select_related, in addition to the foreign key and one-to-one relationships that are supported by select_related.
So let's say we have 2 outer joins and 1000 matching rows in total:
Number of queries without prefetch_related: 1 + 2*1000 = 2001
Number of queries with prefetch_related: 1 + 2 = 3
So it makes very little sense to worry about that 1 extra query per join.
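To illustrate what "does the joining in Python" means and where those query counts come from, here is a rough sketch with plain sqlite3. It is not Django's actual implementation; the tables and both helper functions are made up for the example, and there is a single relation and three rows instead of 2 and 1000.

```python
import sqlite3

# Hypothetical tables: one "member" row can have many "member_permission" rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE member (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE member_permission (
        id INTEGER PRIMARY KEY,
        member_id INTEGER REFERENCES member(id),
        permission TEXT
    );
    INSERT INTO member (id, name) VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO member_permission (member_id, permission) VALUES
        (1, 'read'), (1, 'write'), (2, 'read');
""")


def one_query_per_row(conn):
    """Naive loop: 1 query for the members plus 1 query per member."""
    queries = 1
    members = conn.execute("SELECT id FROM member ORDER BY id").fetchall()
    result = {}
    for (member_id,) in members:
        queries += 1
        rows = conn.execute(
            "SELECT permission FROM member_permission WHERE member_id = ? ORDER BY id",
            (member_id,),
        ).fetchall()
        result[member_id] = [permission for (permission,) in rows]
    return result, queries


def prefetch_style(conn):
    """Two queries in total; the 'join' is done in Python with a dict."""
    queries = 1
    member_ids = [row[0] for row in conn.execute("SELECT id FROM member ORDER BY id")]
    result = {member_id: [] for member_id in member_ids}
    if member_ids:
        queries += 1
        placeholders = ", ".join("?" for _ in member_ids)
        rows = conn.execute(
            "SELECT member_id, permission FROM member_permission "
            f"WHERE member_id IN ({placeholders}) ORDER BY id",
            member_ids,
        )
        # Attach each related row to its parent in Python.
        for member_id, permission in rows:
            result[member_id].append(permission)
    return result, queries


naive_result, naive_queries = one_query_per_row(conn)
prefetch_result, prefetch_queries = prefetch_style(conn)
print(naive_queries, prefetch_queries)
```

Both functions return the same data; only the number of round trips to the database differs (1 + N versus 1 + 1 for a single relation).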
I'm trying to optimise my queries but prefetch_related insists on joining the tables and selecting all the fields even though I only need the list of ids from the relations table.
You can ignore the 4th query. It's not related to the question.
Related Code:
class Contact(models.Model):
    ...
    Groups = models.ManyToManyField(ContactGroup, related_name='contacts')
    ...

queryset = Contact.objects.all().prefetch_related('Groups')
Django 1.7 added Prefetch objects which let you customise the queryset used when prefetching.
In particular, see only().
In this case, you'd want something like:
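from django.db.models import Prefetch

# ContactGroup is the model on the other side of the Groups field
queryset = Contact.objects.all().prefetch_related(
    Prefetch('Groups', queryset=ContactGroup.objects.all().only('id')))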
I have following models:
class Product(models.Model):
    name = models.CharField(max_length=30)

class Store(models.Model):
    name = models.CharField(max_length=30)
    product = models.ManyToManyField(Product)
How can I get the Stores that have a product named product_name, along with all of their products (except the product named product_name)? Is it possible to do this in one query?
In raw SQL it would be simple JOINs. Not sure how to implement it via Django.
You can actually do this with Django thanks to its lazy queryset evaluation. Django's in field lookup accepts both lists and querysets. The following produces a single SQL query with a nested subquery:
stores_qs = Store.objects.filter(product__name='product_name')
products = Product.objects.filter(store__in=stores_qs)
Here are the Django in docs.
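For readers who want to see the shape of such a nested query, here is a small sqlite3 sketch. The table names (including the store_product join table) are invented for the example, the SQL is only an approximation of what the ORM would emit, and it additionally excludes the named product, as the question asks.

```python
import sqlite3

# Hypothetical schema: store_product plays the role of the many-to-many table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE store (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE store_product (store_id INTEGER, product_id INTEGER);
    INSERT INTO product (id, name) VALUES (1, 'apple'), (2, 'bread'), (3, 'cheese');
    INSERT INTO store (id, name) VALUES (1, 'north'), (2, 'south');
    INSERT INTO store_product (store_id, product_id) VALUES (1, 1), (1, 2), (2, 3);
""")

wanted = "apple"

# Outer query: products sold by the matching stores, minus the named product.
# Inner query: ids of the stores that sell the named product.
rows = conn.execute("""
    SELECT DISTINCT product.name
    FROM product
    JOIN store_product ON store_product.product_id = product.id
    WHERE store_product.store_id IN (
        SELECT sp.store_id
        FROM store_product AS sp
        JOIN product AS p ON p.id = sp.product_id
        WHERE p.name = ?
    )
    AND product.name <> ?
    ORDER BY product.name
""", (wanted, wanted)).fetchall()

other_products = [name for (name,) in rows]
print(other_products)
```

Only the 'north' store sells 'apple' here, so the outer query returns the other product of that store.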
You should be able to filter the stores based on an attribute of Product, and then prefetch_related of the retrieved objects.
Store.objects.filter(product__name="product_name").prefetch_related('product')
This should hit the database the fewest times to achieve what you are looking for - twice.
Further documentation can be found here.
Get Stores with product named "product_name" :
Store.objects.filter(product__name='product_name')
Get all the products except the product with name "product_name":
Product.objects.exclude(name='product_name')
I need to get a queryset with the first book (by a date field) for each author (related by foreign key). Is there a Django ORM way to do this? (A solution without custom SQL is preferred, but custom SQL is acceptable.)
Edit: Please note that an answer that only works on a modern open source backend like PostgreSQL is acceptable (an ORM-based solution is still preferred over a pure custom SQL query).
Models
class Book(Model):
    date = DateField()
    author = ForeignKey('Author')  # string reference, since Author is defined below

class Author(Model):
    name = CharField()
Book.objects.filter(??)
If you use PostgreSQL or another DB backend with support for DISTINCT ON, there is a nice solution:
# '-date' picks the latest book per author; order by 'date' for the earliest
Book.objects.order_by('author', '-date').distinct('author')
Otherwise I don't know a solution with only one query. But you can try this:
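from functools import reduce  # reduce is not a builtin in Python 3
import operator

from django.db.models import Q, Max

# Query 1: one dict per author with the latest date
books = Book.objects.values('author_id').annotate(max_date=Max('date'))
# OR together one (author_id AND date) condition per author
filters = reduce(operator.or_, [(Q(author_id=b['author_id']) &
                                 Q(date=b['max_date'])) for b in books])
# Query 2: the matching Book instances
queryset = Book.objects.filter(filters)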
With the combination of .values() and .annotate() we group by the author and annotate the latest date of all books from that author. But the result is a queryset of dicts, not of Book instances.
Then we build a SQL statement like WHERE (author_id=X1 AND date=Y1) OR (author_id=X2 AND date=Y2) ... Now the second query is easy.
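The same two-step idea can be sketched at the SQL level with sqlite3. The book table below is hypothetical and this is not the SQL Django generates, just the same "group first, then OR the pairs together" approach. Note the guard for an empty first result: with no authors there is nothing to OR together (the reduce() call above would raise a TypeError in that case).

```python
import sqlite3

# Hypothetical table standing in for the Book model.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE book (id INTEGER PRIMARY KEY, author_id INTEGER, date TEXT);
    INSERT INTO book (id, author_id, date) VALUES
        (1, 1, '2001-05-01'),
        (2, 1, '2010-03-15'),
        (3, 2, '1999-12-31');
""")

# Query 1: group by author and take the latest date per author.
latest = conn.execute(
    "SELECT author_id, MAX(date) FROM book GROUP BY author_id ORDER BY author_id"
).fetchall()

# Query 2: OR together one "(author_id = ? AND date = ?)" clause per author.
books = []
if latest:  # an empty list would leave nothing to OR together
    where = " OR ".join("(author_id = ? AND date = ?)" for _ in latest)
    params = [value for pair in latest for value in pair]
    books = conn.execute(
        f"SELECT id, author_id, date FROM book WHERE {where} ORDER BY id", params
    ).fetchall()

print(books)
```

If an author has two books on the same latest date, both rows come back, which is also true of the ORM version.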
I run a lab annotation website where users can annotate samples with tags relating to disease, tissue type, etc. Here is a simple example from models.py:
from django.contrib.auth.models import User
from django.db import models

class Sample(models.Model):
    name = models.CharField(max_length=255)
    tags = models.ManyToManyField('Tag', through='Annot')

class Tag(models.Model):
    name = models.CharField(max_length=255)

class Annot(models.Model):
    tag = models.ForeignKey('Tag')
    sample = models.ForeignKey('Sample')
    user = models.ForeignKey(User, null=True)
    date = models.DateField(auto_now_add=True)
I'm looking for a query in Django's ORM that returns the sample/tag pairs on which two users agree, i.e. both annotated the same sample with the same tag. It would be helpful if I could supply a list of users to limit the query (for example, if someone only believes User1 and User2 and wants to find the sample/tag pairs that only they agree on).
I think I understood what you need. This one made me think, thanks! :-)
I believe the equivalent SQL query would be something like:
select t.name, s.name, count(user_id) count_of_users
from yourapp_annot a, yourapp_tag t, yourapp_sample s
where a.tag_id = t.id
  and s.id = a.sample_id
group by t.name, s.name
having count(user_id) > 1
While I try hard not to think in SQL when I'm coming up with django model navigation (it tends to get in the way); when it comes to aggregation queries it always helps me to visualize what the SQL would be.
In django we now have aggregations.
Here is what I came up with:
models.Annot.objects.select_related().values(
    'tag__name', 'sample__name').annotate(
    count_of_users=Count('user__id')).filter(count_of_users__gt=1)
The result set will contain the tag, the sample, and the count of users that tagged said sample with said tag.
Breaking it apart for the folks that are not used to django aggregation:
models.Annot.objects.select_related()
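select_related() asks Django to retrieve the tables related to Annot in the same query.
I added it so that I can specify tag__name and sample__name in the values() call. (In current Django versions it is not required for that: values() can follow tag__name and sample__name across the relations on its own.)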
values('tag__name','sample__name')
values() is limiting the fields to retrieve to tag.name and sample.name
This makes sure that my aggregation on the count of users will group by just these fields
annotate(count_of_users=Count('user__id'))
annotate() adds an aggregation as an extra field to a query
filter(count_of_users__gt=1)
And finally I filter on the aggregate count.
If you want to add an additional filter on what users should be taken into account, you need to do this:
models.Annot.objects.filter(user__in=[... list of users...]).select_related().values(
    'tag__name', 'sample__name').annotate(
    count_of_users=Count('user__id')).filter(count_of_users__gt=1)
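If it helps to see the grouping and the optional user filter in action, here is a self-contained sqlite3 sketch. The table names, the sample data and the agreed_pairs() helper are all made up for illustration; the SQL mirrors the hand-written query earlier in this answer rather than the exact SQL Django produces.

```python
import sqlite3

# Hypothetical tables standing in for the Tag, Sample and Annot models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sample (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE annot (
        id INTEGER PRIMARY KEY, tag_id INTEGER, sample_id INTEGER, user_id INTEGER
    );
    INSERT INTO tag (id, name) VALUES (1, 'cancer'), (2, 'liver');
    INSERT INTO sample (id, name) VALUES (1, 's1'), (2, 's2');
    INSERT INTO annot (tag_id, sample_id, user_id) VALUES
        (1, 1, 100), (1, 1, 101), (2, 1, 100), (2, 2, 102), (2, 2, 101);
""")


def agreed_pairs(conn, user_ids=None):
    """Return (tag, sample, count) rows annotated by more than one user."""
    where, params = "", []
    if user_ids:
        # Restrict the rows before grouping, like filter(user__in=...) would.
        placeholders = ", ".join("?" for _ in user_ids)
        where = f"WHERE a.user_id IN ({placeholders})"
        params = list(user_ids)
    sql = f"""
        SELECT t.name, s.name, COUNT(a.user_id) AS count_of_users
        FROM annot AS a
        JOIN tag AS t ON t.id = a.tag_id
        JOIN sample AS s ON s.id = a.sample_id
        {where}
        GROUP BY t.name, s.name
        HAVING COUNT(a.user_id) > 1
        ORDER BY t.name, s.name
    """
    return conn.execute(sql, params).fetchall()


everyone = agreed_pairs(conn)
only_two = agreed_pairs(conn, user_ids=[100, 101])
print(everyone, only_two)
```

One caveat with counting rows this way: if the same user can annotate the same sample with the same tag more than once, those duplicates are counted too, so counting distinct users may be the safer choice in that case.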
I think that is it.
One thing... Notice that I used tag__name and sample__name in the query above. But your models do not specify that tag names and sample names are unique.
Should they be unique? Add a unique=True to the field definitions in the models.
Shouldn't they be unique? You need to replace tag__name and sample__name with tag__id and sample__id in the query above.