Django annotate Avg of foreign model - django

There are two models, Article and Review, in a one-to-many relationship (one article has many reviews). Some articles don't have any reviews.
I want to order articles by review rating, so I use annotate with Avg:
ArticleQueryset.annotate(rating=models.Avg('reviews__rating')).order_by('-rating')
The issue is that for articles without reviews the rating value is None, and somehow that sorts before the maximum rating. As a result, the first results have no rating at all, and only then do the highest-rated articles show up.

Pass nulls_last=True to desc() on the expression you give to order_by():
ArticleQueryset.annotate(
    rating=models.Avg('reviews__rating')
).order_by(models.F('rating').desc(nulls_last=True))
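
To see what nulls-last ordering does at the SQL level, here is a minimal sketch using an in-memory SQLite database. The table names, column names, and sample rows are assumptions made up for illustration, and the SQL is a hand-written approximation, not the exact statement Django generates. SQLite happens to sort NULLs last on DESC by default, so the explicit `IS NULL` term is there to make the intent visible and portable.

```python
import sqlite3

# Hypothetical schema and data, only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE review (id INTEGER PRIMARY KEY, article_id INTEGER, rating INTEGER);
    INSERT INTO article VALUES (1, 'no reviews'), (2, 'good'), (3, 'average');
    INSERT INTO review (article_id, rating) VALUES (2, 5), (2, 4), (3, 3);
""")

# LEFT JOIN keeps articles without reviews; their average is NULL.
# Sorting on "IS NULL" first pushes those rows to the end.
rows = conn.execute("""
    SELECT a.title, AVG(r.rating)
    FROM article a
    LEFT JOIN review r ON r.article_id = a.id
    GROUP BY a.id, a.title
    ORDER BY AVG(r.rating) IS NULL, AVG(r.rating) DESC
""").fetchall()

for title, rating in rows:
    print(title, rating)
```

The article without reviews keeps a rating of None but now appears after every rated article.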

Related

django query with filtered annotations from related table

Take Books and Authors models as an example, with each book having one or more authors. Books have a cover_type and authors have a country as origin.
How can I list all the books with a hard cover, including their authors only if they're from France?
Books.objects.filter(cover_type='hard', authors__origin='france')
This query doesn't retrieve books that have a hard cover but no French author.
I want all the books with a hard cover; this is predicate #1.
And if their authors are from France, I want them annotated; otherwise the authors field may be empty or 'None'.
e.g.:
`
Bookname, covertype, origin
The Trial, hardcover, none
Madam Bovary, hardcover, France
`
I tried many options (annotate, Q, Value, Subquery, When, Case, Exists) but couldn't come up with a solution.
With SQL this is so easy:
select * from books b left join authors a on a.bookref=b.id and a.origin='france' where b.covertype='hard'
(My models are not really books and authors; I picked them because they're the example models in the Django docs. My models are building and buildingtype, where I want building.id=454523 with its buildingtype where the buildingtype is active. A building might have no buildingtype at all, or exactly one active and zero or more passive ones.)
You should put the Book id in the Author table. Then your query will look like this: Author.objects.filter(origin="france", book__cover_type="hard")
I think I solved it with Subquery, OuterRef, Exists, Case, When, CharField... too many imports for a simple SQL query.
`
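from django.db.models import Case, CharField, Exists, OuterRef, Subquery, Value, When

author = Authors.objects.filter(bookref=OuterRef('id'), origin='France').values('origin')
books = Books.objects.filter(cover_type='hard').annotate(
    author=Case(
        When(Exists(author), then=Subquery(author)),
        default=Value('none'),  # a bare string here would be read as a field name
        output_field=CharField(),
    )
).distinct().values('name', 'cover_type', 'author')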
`
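
The difference between the two SQL shapes discussed in this question can be checked with a small sketch on an in-memory SQLite database. The table layout follows the SQL in the question (books, authors, bookref, origin); the sample rows, including the non-French origin, are made up for illustration. Putting the origin condition in the ON clause keeps every hard-cover book, while putting it in WHERE drops books without a French author.

```python
import sqlite3

# Hypothetical data mirroring the tables named in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books (id INTEGER PRIMARY KEY, name TEXT, cover_type TEXT);
    CREATE TABLE authors (id INTEGER PRIMARY KEY, bookref INTEGER, origin TEXT);
    INSERT INTO books VALUES
        (1, 'The Trial', 'hard'),
        (2, 'Madam Bovary', 'hard'),
        (3, 'Pocket Guide', 'soft');
    INSERT INTO authors (bookref, origin) VALUES
        (1, 'Czechia'), (2, 'France'), (3, 'France');
""")

# Condition in ON: unmatched books survive with a NULL origin.
condition_in_on = conn.execute("""
    SELECT b.name, b.cover_type, a.origin
    FROM books b
    LEFT JOIN authors a ON a.bookref = b.id AND a.origin = 'France'
    WHERE b.cover_type = 'hard'
    ORDER BY b.name
""").fetchall()

# Condition in WHERE: behaves like the filter() call in the question.
condition_in_where = conn.execute("""
    SELECT b.name, b.cover_type, a.origin
    FROM books b
    JOIN authors a ON a.bookref = b.id
    WHERE b.cover_type = 'hard' AND a.origin = 'France'
    ORDER BY b.name
""").fetchall()

print(condition_in_on)
print(condition_in_where)
```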

How do I do a Django query on a Model, where I order by field A, but filter distinct on field B?

Suppose I have a Book table, where each book has an author field and a publishing date field.
I would like to get the latest book by each author. I'm using PostgreSQL as the backend.
The obvious (and wrong) solution would be:
Book.objects.order_by("author", "-published_on").distinct("author").all()
The problem is that while the result contains only one book from each author, there is no guarantee that it is the latest book. This might be because I'm using random UUIDs as PKs. I can't change that; it's a requirement.
The next obvious (and wrong) solution would be:
Book.objects.order_by("author", "-published_on").distinct("author", "published_on").all()
Here the ordering of the books is correct, but we get multiple books from the same author.
I have also tried flipping around the arguments:
Book.objects.order_by("-published_on", "author").distinct("published_on", "author").all()
Here the ordering of the books is correct, but we get multiple books from the same author.
How do I do a Django ORM query, where I get the latest book from each author?
EDIT: Here's a query I'm actually running on our live DB, before translating it into the book-style example:
from db.models import User, EventVisibility
user = User.objects.get(username="7g8jltdzbz46ak7nhuz8tzfuu7y9mdym7tiy7klfxjnn")
evs = EventVisibility.objects.filter(user=user).order_by("room", "-created_on").distinct("room")[:20]
for ev in evs:
    print(f"book_id={ev.room.room_id}, published_on={ev.created_on}")
And these are the results:
book_id=2mcnhajfwf5jsgyzpqix36ytbjfucn9u6derkyurlfff, published_on=2020-05-16 00:54:05.083477+00:00
book_id=4rp9ffxqr5marnphbtlahqtwnkzozupyb8ht532ffxl6, published_on=2020-05-12 20:29:31.286095+00:00
book_id=5dqygkksrzq6ay49xxcspagma5cbz8p59sjcavf6pepm, published_on=2020-05-08 09:28:53.508563+00:00
book_id=9mz85qcxreaczcnenebcywqqm3scehjhpwlkso7g4jbd, published_on=2020-05-04 10:52:06.396995+00:00
book_id=9sgiiasbvbtat4iahx7bd7ammzwatgfipe8wmzl9snz5, published_on=2020-05-15 09:00:52.602512+00:00
book_id=b8uvcxuhgjhmvkjjnwkcr5zzj7hrushz2e9mpzkosg8k, published_on=2020-05-08 09:36:47.148885+00:00
book_id=bxif8aal2v4fb3p8wsdvdard5p65ygw8j92tnleqqza4, published_on=2020-04-19 02:43:23.819854+00:00
book_id=cgoad7xuwjhxz6hcxctbl5arnnsrjt5osuwmzunmppra, published_on=2020-05-08 09:36:06.944614+00:00
book_id=cztb84akqqde6fvpj2nneqezvmor5gdjh3hpcjnxcz2x, published_on=2020-05-15 10:06:53.054862+00:00
book_id=czxizxptbvxz7jybkxevk2mkmaxykhgakfluud7ffa2b, published_on=2020-05-17 14:54:43.245325+00:00
book_id=dgtze2ri5snrr7nmurvdechydxjd2ph3dd8rugibn2me, published_on=2020-05-05 19:16:45.254928+00:00
book_id=dp9wu8qmdw6prsvx2zwvrnw5akcxv6llcwa2skeadcpx, published_on=2020-04-27 10:58:32.555542+00:00
book_id=duelfazwfiek8jhr4ew7wa9vrzzuyhznzxcrpybmbuww, published_on=2020-05-15 10:06:45.001961+00:00
book_id=dwhqxqfyolggdf5wwwm3su3yq6ffsh5kwwjxj7wtkdbj, published_on=2020-05-15 05:53:01.153492+00:00
book_id=edakxxhqv7w99lukxr23dfugcarddpwj5ea8wx7r5bmd, published_on=2020-04-27 19:49:29.673872+00:00
book_id=evz9biehu88eds7hgcutw6jfktt4fkjznfgozxsu8jtk, published_on=2020-04-20 21:13:01.693752+00:00
book_id=fqnxa3j4vbbaw7fc5hgrumabtfh2phmd3hg7cgm5ayfa, published_on=2020-05-15 10:04:22.322094+00:00
book_id=gkxahh8y7eqtqzxsnjtdpnghxnipi8vx3qugjcrs6t3m, published_on=2020-04-17 02:14:31.219950+00:00
book_id=hdgoxpnmqde8siwdbgfwwtodqk4hzhefyz8pw3esdmem, published_on=2020-05-17 14:46:49.437289+00:00
book_id=jrg6uae5kyvfvjgjhmwvzf45lbtqmgspawbuqzfewnhc, published_on=2020-05-05 09:11:59.334099+00:00
This is the queryset.query:
SELECT DISTINCT ON ("db_eventvisibility"."room_id") "db_eventvisibility"."id", "db_eventvisibility"."event_id", "db_eventvisibility"."user_id", "db_eventvisibility"."room_id", "db_eventvisibility"."unit_id", "db_eventvisibility"."case_id", "db_eventvisibility"."team_id", "db_eventvisibility"."created_on" FROM "db_eventvisibility" WHERE "db_eventvisibility"."user_id" = 7g8jltdzbz46ak7nhuz8tzfuu7y9mdym7tiy7klfxjnn ORDER BY "db_eventvisibility"."room_id" ASC, "db_eventvisibility"."created_on" DESC LIMIT 20
Quoting the question: "The problem is that while the result contains only one book from each author, then there is no guarantee that it is the latest book. This might be because I'm using random UUIDs as PKs."
To the best of my knowledge, the result is correct in the sense that for each Room you do indeed get the latest EventVisibility, but that is likely not what you want. If you want to sort the Rooms by their latest EventVisibility, you can do that with:
from django.db.models import Max
Room.objects.filter(
    eventvisibility__user=user
).order_by(
    Max('eventvisibility__created_on').desc()
)
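
The two orderings discussed here can be told apart with a plain-Python sketch. It imitates what `ORDER BY room, created_on DESC` plus `DISTINCT ON (room)` returns, and then re-sorts that result by recency, which is what the answer's query aims for. The room names and integer timestamps are invented for illustration.

```python
from itertools import groupby
from operator import itemgetter

# (room, created_on) pairs; integers stand in for timestamps.
events = [
    ("room_a", 3),
    ("room_b", 9),
    ("room_a", 7),
    ("room_c", 5),
    ("room_b", 2),
]

# ORDER BY room ASC, created_on DESC
ordered = sorted(events, key=lambda event: (event[0], -event[1]))

# DISTINCT ON (room): keep only the first row of each run of equal rooms.
latest_per_room = [next(group) for _, group in groupby(ordered, key=itemgetter(0))]

# The latest event per room is correct, but the rows come back sorted by room.
# Sorting them again by timestamp gives "most recently active room first".
newest_first = sorted(latest_per_room, key=itemgetter(1), reverse=True)

print(latest_per_room)
print(newest_first)
```

This is why the output in the question looks unordered by date: each row is the newest for its room, but the list as a whole is sorted by room id.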

How to limit prefetch_related data in django

I have two tables, brand and product. Each brand has multiple products.
I used prefetch_related to fetch, for each brand, only the related product with the minimum price. The problem is that when two products have the same price, both records are selected. How can I limit this to one?
alternatives_data = Brand.objects.filter(category__category_slug=category_slug).prefetch_related(
    Prefetch('products', queryset=Product.objects.annotate(
        min_brand_price=Min('brand__products__product_price')
    ).filter(
        product_price=F('min_brand_price')
    ).order_by('product_id')))
I tried everything but nothing worked!
To prevent a query from returning multiple records with duplicate values in specific columns, use the distinct() method.
In your case, add .distinct('product_price') to the Product queryset inside the prefetch. The fields passed to distinct() must also come first in order_by().
There is, however, one caveat: distinct() with field names is supported on PostgreSQL only.
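
The goal here, exactly one cheapest product per brand with ties broken by product_id, can be sketched in plain Python. This is not Django code: it only imitates what a `DISTINCT ON (brand)` query ordered by brand, product_price, product_id would keep. The field names follow the question; the sample rows are made up.

```python
# Hypothetical product rows; field names follow the question.
products = [
    {"product_id": 1, "brand": "acme", "product_price": 10},
    {"product_id": 2, "brand": "acme", "product_price": 10},
    {"product_id": 3, "brand": "acme", "product_price": 15},
    {"product_id": 4, "brand": "zenith", "product_price": 10},
    {"product_id": 5, "brand": "zenith", "product_price": 7},
]


def sort_key(product):
    # Cheapest first; product_id breaks ties between equal prices.
    return (product["product_price"], product["product_id"])


cheapest_by_brand = {}
for product in products:
    current = cheapest_by_brand.get(product["brand"])
    if current is None or sort_key(product) < sort_key(current):
        cheapest_by_brand[product["brand"]] = product

for brand, product in cheapest_by_brand.items():
    print(brand, product["product_id"], product["product_price"])
```

Note that the deduplication is per brand, not per price: two brands may share the same minimum price and each still keeps its own product.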

Calculating frequency of occurrence as a percentage of total, in Django queryset

In a Django app, I have a queryset for a data model called Comment. This contains text comments left by users.
Imagine 'n' users commented. What's the fastest way to calculate what % of comments were left by which user?
Right now I'm thinking it's going to be:
Comment.objects.filter(commenter=<username>).count()/Comment.objects.count()
What do you suggest? My objective is to flag people who're commenting too much, in order to screen their accounts for possible spamming. I'd be running this query voluminously, hence the focus on performance.
You should avoid making one query for each user in your database. Instead you can just query the number of comments for each user (or even the top n commenters) with something like:
from django.db.models import Count
total_comments = Comment.objects.count()
# Fetch top 10 commenters, annotated with number of comments, ordered by highest first
users = User.objects.annotate(num_comments=Count('comment')).order_by('-num_comments')[:10]
for user in users:
    # Fraction of all comments written by this user (multiply by 100 for a percentage)
    percentage = user.num_comments / total_comments
This example assumes you have a User model that your Comment has a foreign key to.
The percentage of total comments doesn't actually matter if you are comparing relative numbers of comments.
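
The arithmetic behind this approach (count once per user, divide by one total) can be tried without a database. The sketch below uses collections.Counter on a made-up list of commenter names; in the real application the counts would come from the annotated queryset instead.

```python
from collections import Counter

# Hypothetical list with one entry per comment, holding the commenter's name.
comment_authors = ["ann", "bob", "ann", "cid", "ann", "bob"]

counts = Counter(comment_authors)
total = sum(counts.values())

# Percentage of all comments per user, most active commenter first.
shares = {user: 100 * n / total for user, n in counts.most_common()}

for user, share in shares.items():
    print(f"{user}: {share:.1f}%")
```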

Filtering related fields in django's annotate() feature

The Django documentation gives examples for using annotate() to produce aggregate results based on related fields of a QuerySet (i.e., using joins in annotate()).
A simplified example from the docs is for instance Store.objects.annotate(min_price=Min('books__price')), where books is a ManyToMany field of Store to Book and price is a field of Book.
To continue this example, how would I generate an annotated QuerySet of Store objects with the lowest prices not for all books in the store, but just for books with "author='William Shakespeare'"? In other words, how do I filter the related fields used to calculate the aggregate?
The documentation explains how to do this:
Store.objects.filter(books__author='William Shakespeare').annotate(
    min_price=Min('books__price'))
As the documentation notes, the order of filter and annotate matters here: because you only want books that match the filter to count towards the annotation, the filter must come first.
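
One consequence of filtering first is that stores with no matching book drop out of the result entirely. The sketch below shows this at the SQL level on an in-memory SQLite database, next to a conditional aggregate that keeps every store. The single flattened `stock` table and its rows are simplifying assumptions for illustration, not the many-to-many schema from the docs. In Django 2.0 and later, the conditional form roughly corresponds to passing `filter=Q(...)` to the aggregate.

```python
import sqlite3

# Hypothetical flattened store/book data, only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stock (store TEXT, author TEXT, price REAL);
    INSERT INTO stock VALUES
        ('Globe Books', 'William Shakespeare', 12.0),
        ('Globe Books', 'William Shakespeare', 8.5),
        ('Globe Books', 'Jane Austen', 5.0),
        ('Corner Shop', 'Jane Austen', 6.0);
""")

# Filter before aggregating: only stores that carry Shakespeare remain.
filtered_first = conn.execute("""
    SELECT store, MIN(price)
    FROM stock
    WHERE author = 'William Shakespeare'
    GROUP BY store
    ORDER BY store
""").fetchall()

# Conditional aggregate: every store remains, with NULL when nothing matches.
conditional = conn.execute("""
    SELECT store, MIN(CASE WHEN author = 'William Shakespeare' THEN price END)
    FROM stock
    GROUP BY store
    ORDER BY store
""").fetchall()

print(filtered_first)
print(conditional)
```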