Django: Remove duplicates only if a field is present

I have two models:
class BookSeries(models.Model):
    title = models.CharField(max_length=200, null=False, blank=False, unique=True)
    # extra fields
class Book(models.Model):
    series = models.ForeignKey(BookSeries, blank=True, null=True, default=None)
    publisher = models.ForeignKey(Publisher, default=None, null=True, blank=True)
    title = models.CharField(max_length=200, null=False, blank=False, unique=True)
    # extra fields
Now I want to query all the books that don't belong to a series, plus only one book from each series (series can be null).
Problem statement:
I want to query all the individual books and series.
A series can have multiple books, and a book may not belong to a series. One solution is to query all the Book objects that don't belong to a series and, separately, all the BookSeries objects, as described here. But this would return all the series together and all the books together in the response. I don't want them grouped like that (I am also using pagination).
Something like: Book.objects.filter(distinct only if (series is not None))
I thought of using distinct and exclude but couldn't make it work.

I would suggest the following approach:
Get the ids of all the books that don't belong to a series:
ids_list1 = list(Book.objects.filter(series=None).values_list('id', flat=True))
Get the ids of the books that belong to a series, keeping only the first book of each series with distinct (passing field names to distinct only works on PostgreSQL):
ids_list2 = list(Book.objects
                 .exclude(series=None)   # exclude ones which are not in a series
                 .order_by('series')     # order by series
                 .distinct('series')     # keep the first book in each series
                 .values_list('id', flat=True))
Now you can combine these two lists and make another query that returns only the books with these ids:
ids = ids_list1 + ids_list2
books = Book.objects.filter(id__in=ids)
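To make the intended result concrete, here is a minimal plain-Python sketch of the selection rule: keep every book without a series, and only the first book seen for each series, without changing the overall order. It does not use the Django ORM. The dictionaries stand in for Book rows, and one_per_series is a hypothetical helper name chosen for illustration.

```python
def one_per_series(books):
    """Keep every book without a series and the first book seen for each series."""
    seen_series = set()
    result = []
    for book in books:
        series = book["series"]
        if series is None:
            # Standalone books are always kept.
            result.append(book)
        elif series not in seen_series:
            # First book of this series: keep it and remember the series.
            seen_series.add(series)
            result.append(book)
    return result


books = [
    {"id": 1, "series": None},
    {"id": 2, "series": "A"},
    {"id": 3, "series": "A"},
    {"id": 4, "series": None},
    {"id": 5, "series": "B"},
]
selected_ids = [book["id"] for book in one_per_series(books)]
```

Because the input order is preserved, standalone books and series representatives stay interleaved, which is the property the question asks for when paginating.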

First exclude all books whose series is None, then call distinct('series'), which gives you one book per series:
Book.objects.exclude(series=None).distinct('series')
If you need to exclude null values and empty strings, the preferred way to do so is to chain the conditions together like so (the empty-string check only makes sense for text fields, not for a ForeignKey such as series):
Book.objects.exclude(series__isnull=True).exclude(series__exact='')
You can follow this thread for a better understanding: Filtering for empty or NULL names in a queryset

Related

Using ForeignKey to sort with order_by and distinct not working

I'm trying to sort model Game by each title and most recent update(post) without returning duplicates.
views.py
'recent_games': Game.objects.all().order_by('title', '-update__date_published').distinct('title')[:5],
The distinct method on the query works perfectly however the update__date_published doesn't seem to be working.
models.py
Model - Game
class Game(models.Model):
    title = models.CharField(max_length=100)
    slug = models.SlugField(unique=True)
    description = models.TextField()
    date_published = models.DateTimeField(default=timezone.now)
    cover = models.ImageField(upload_to='game_covers')
    cover_display = models.ImageField(default='default.png', upload_to='game_displays')
    developer = models.CharField(max_length=100)
    twitter = models.CharField(max_length=50, default='')
    reddit = models.CharField(max_length=50, default='')
    platform = models.ManyToManyField(Platform)
    def __str__(self):
        return self.title
Model - Update
class Update(models.Model):
    author = models.ForeignKey(User, models.SET_NULL, blank=True, null=True,) # If user is deleted keep all updates by said user
    article_title = models.CharField(max_length=100, help_text="Use format: Release Notes for MM/DD/YYYY")
    content = models.TextField(help_text="Try to stick with a central theme for your game. Bullet points is the preferred method of posting updates.")
    date_published = models.DateTimeField(db_index=True, default=timezone.now, help_text="Use date of update not current time")
    game = models.ForeignKey(Game, on_delete=models.CASCADE)
    article_image = models.ImageField(default='/media/default.png', upload_to='article_pics', help_text="")
    platform = ChainedManyToManyField(
        Platform,
        horizontal=True,
        chained_field="game",
        chained_model_field="game",
        help_text="You must select a game first to autopopulate this field. You can select multiple platforms using Ctrl & Select (PC) or ⌘ & Select (Mac).")
See the distinct() reference for examples (those after the first will only work on PostgreSQL).
See the reverse lookup documentation for how update__date_published is resolved.
Example:
Entry.objects.order_by('blog__name', 'mod_date').distinct('blog__name', 'mod_date')
Your query:
Game.objects.order_by('title', '-update__date_published').distinct('title')[:5]
You said:
The -update__date_published does not seem to be working as the Games are only returning in alphabetical order.
The reason is that the first order_by field is title; the secondary order field -update__date_published would only kick in if you had several identical titles, which you don't because of distinct().
If you want the Game objects to be ordered by latest update rather than by their title, omitting title from the ordering seems the obvious solution, until you get a ProgrammingError saying that the DISTINCT ON field must appear at the start of the ORDER BY clause.
The real solution to sorting games by latest update is to annotate each game with its latest update date and order by that annotation:
from django.db.models import Max
games = (Game.objects
         .annotate(max_date=Max('update__date_published'))
         .order_by('-max_date'))[:5]
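The idea of "one row per game, ordered by its most recent update" can be illustrated without a database. The following is a plain-Python sketch, not ORM code. The (title, date) tuples are made-up sample data standing in for Update rows joined to their Game.

```python
from datetime import date

# (game title, update publication date) pairs, in arbitrary order
updates = [
    ("Alpha", date(2019, 1, 5)),
    ("Beta", date(2019, 3, 1)),
    ("Alpha", date(2019, 2, 10)),
    ("Gamma", date(2018, 12, 31)),
]

# Equivalent of annotating each game with the maximum update date
latest = {}
for title, published in updates:
    if title not in latest or published > latest[title]:
        latest[title] = published

# Equivalent of ordering by that maximum, newest first, and taking five
recent_titles = sorted(latest, key=latest.get, reverse=True)[:5]
```

Grouping first and sorting second is what removes the need for DISTINCT ON: each game appears once because it is a dictionary key, so the sort key is free to be the date alone.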
The most probable misunderstanding here is the join in your ORM query. Relations are usually lazy-loaded, so the date_published field is not yet available, yet you are trying to sort on it. You need the select_related method to load the FK relation as a join.
'recent_games': Game.objects.select_related('update').all().order_by('title', '-update__date_published').distinct('title')[:5]

django-filters using ModelChoiceFilter to get value of ForeignKey

I'm trying to use ModelChoiceFilter to filter a database of letters based on the author. Author is a ForeignKey, and I can't seem to get it to display the "name" value of the ForeignKey.
Here is what I have:
models.py (limited to relevant bits)
class Person(models.Model):
    name = models.CharField(max_length=250, verbose_name='Full Name')
    ...
    def __str__(self):
        return self.name
class Letter(models.Model):
    author = models.ForeignKey(Person, related_name='author', on_delete=models.PROTECT, verbose_name='Letter Author')
    recipient = models.ForeignKey(Person, related_name='recipient', on_delete=models.PROTECT, verbose_name='Recipient')
    ...
    title = models.CharField(max_length=250, verbose_name='Title of Letter')
    def __str__(self):
        return self.title
letter_filters.py
class LetterFilter(django_filters.FilterSet):
    ...
    author = django_filters.ModelChoiceFilter(queryset=Letter.objects.order_by('author__name'))
    class Meta:
        model = Letter
        fields = ['author', 'recipient']
I can see that this kind of works. It is indeed limiting and ordering it properly, but instead of the author name being presented in the select box, it's presenting "title" from the letter (but I can tell from the title, in the proper order).
What I thought should work is this:
fields = ['author__name', 'recipient']
But that too continues to list "title" from Letter instead of "name" from Person.
I know it has what I need, because if I do:
author = django_filters.ModelChoiceFilter(queryset=Letter.objects.order_by('author__name').values('author__name'))
I get exactly what I want! But, it's presented as {'author__name':'Jane Doe'} with fields author or author_name. I just can't seem to get the right syntax.
Finally, I know I can do:
author = django_filters.ModelChoiceFilter(queryset=Person.objects.order_by('name'))
Which returns all Persons, properly ordered. However there are many more persons in the database than just authors. This is the same result as just allowing the default fields['author'... without setting the author= in the class (though unordered).
Well, the queryset you specify deals with Letters, so in that case it is the Letters that are added to the ModelChoiceFilter, which is not ideal at all.
You can however generate a list of Persons who have written at least one letter like:
django_filters.ModelChoiceFilter(
    queryset=Person.objects.filter(author__isnull=False).order_by('name').distinct()
)
So here we filter on the fact that the reverse relation from Letter.author is not empty (it is named author because the ForeignKey sets related_name='author'), and since this will result in a JOIN where a Person can occur multiple times, we add .distinct() to it.
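Why the JOIN produces repeated Persons, and what .distinct() then does, can be seen in a small plain-Python sketch. It uses made-up names and plain lists rather than the ORM, so it only illustrates the row semantics.

```python
letters = [
    {"title": "On travel", "author": "Jane Doe"},
    {"title": "On weather", "author": "Adam Smith"},
    {"title": "On gardens", "author": "Jane Doe"},
]
people = ["Adam Smith", "Jane Doe", "Mary Major"]

# Joining people to letters yields one row per matching letter,
# so a person with two letters appears twice and one with none disappears.
joined = [
    person
    for person in people
    for letter in letters
    if letter["author"] == person
]

# distinct() collapses the repeats; sorting mirrors order_by('name').
author_choices = sorted(set(joined))
```

Mary Major has written no letter, so she never appears in the joined rows and is absent from the choices.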
I find this modeling however very weird (in your three examples). It basically means that you only can assign Persons that already wrote a Letter. What if a person that has never written a Letter wants to write a Letter?
Usually in case there are different such roles, you can for example add a BooleanField:
class Person(models.Model):
    name = models.CharField(max_length=250, verbose_name='Full Name')
    is_author = models.BooleanField(verbose_name='Is the person an author')
    # ...
Then we can filter on Persons that are Authors:
django_filters.ModelChoiceFilter(
    queryset=Person.objects.filter(is_author=True).order_by('name')
)

Django unique_together on primary key

Currently I have three models:
class Tutorial(models.Model):
    title = models.CharField(max_length=100)
    description = models.TextField()
    videos = models.ManyToManyField('TutorialVideo', through='TutorialVideoThrough', blank=True)
class TutorialVideo(models.Model):
    name = models.CharField(max_length=100)
    video = S3DirectField(dest='tutorial_vids')
    length = models.CharField("Length (hh:mm:ss)",max_length=100)
    tutorials = models.ManyToManyField(Tutorial, through='TutorialVideoThrough', blank=True)
class TutorialVideoThrough(models.Model):
    tutorial = models.ForeignKey(Tutorial, related_name='tutorial_video_through', blank=True, null=True)
    video = models.ForeignKey(TutorialVideo, related_name='video_tutorial_through', blank=True, null=True)
    order = models.IntegerField()
A Tutorial can have many TutorialVideos through TutorialVideoThrough. On the TutorialVideoThrough, I have an order field to decide what order to show the videos in. Is there a way to validate that there are no duplicate order integers? For example, if I am linking two TutorialVideos to a Tutorial through TutorialVideoThrough, and I put the first one as order 1, then I shouldn't be able to give the second video as order 1.
I tried using unique_together = ('id', 'order') on TutorialVideoThrough, but it didn't work.
The primary key is always unique, so a unique_together pair that includes it will always be unique too.
If you must ensure that no two videos share the same order, you first have to answer the question: unique with respect to what?
If you want order to be unique within the list of TutorialVideos of a Tutorial, your unique_together should be:
unique_together = (('tutorial', 'order'),)
If you want order to be unique within the list of Tutorials of a TutorialVideo, your unique_together should be:
unique_together = (('video', 'order'),)
But making order unique is not a good idea: you will run into issues when you try to reorder the videos.
If you want each order to be unique for each tutorial, then you want
unique_together = (
    ('tutorial', 'order'),
)
You probably want each video to appear only once in each tutorial. If that's the case, add ('tutorial', 'video') to unique_together.
unique_together = (
    ('tutorial', 'order'),
    ('tutorial', 'video'),
)
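What a ('tutorial', 'order') uniqueness rule rejects can be sketched in plain Python. This is only an illustration of the rule, not how the database or Django enforces it. The dictionaries stand in for TutorialVideoThrough rows, and duplicate_orders is a hypothetical helper name.

```python
from collections import Counter


def duplicate_orders(links):
    """Return the (tutorial, order) pairs that appear more than once."""
    counts = Counter((link["tutorial"], link["order"]) for link in links)
    return sorted(pair for pair, n in counts.items() if n > 1)


links = [
    {"tutorial": 1, "video": 10, "order": 1},
    {"tutorial": 1, "video": 11, "order": 1},  # clashes with the row above
    {"tutorial": 1, "video": 12, "order": 2},
    {"tutorial": 2, "video": 10, "order": 1},  # same order, other tutorial: allowed
]
clashes = duplicate_orders(links)
```

The same order value is fine across different tutorials. Only a repeat within one tutorial counts as a clash, which is why the pair, rather than order alone, is made unique.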
Note that you should only define the many-to-many relation on one side. If you have
class Tutorial(models.Model):
    videos = models.ManyToManyField('TutorialVideo', through='TutorialVideoThrough', blank=True, related_name='tutorials')
then you should remove the tutorials field from the TutorialVideo model.

Django : Count only non-empty CharField with annotate() & values()

Django recommends not using null on CharField; however, annotate includes empty strings in the count. Is there a way to avoid that without excluding rows with an empty string from the query?
My question isn't simply how to achieve my query but, more fundamentally, whether annotate/aggregate counts should include empty fields or not. Django treats the empty string as a replacement for NULL in string-based fields.
My model :
class Book(models.Model):
    name = models.CharField(...)
class Review(models.Model):
    book = models.ForeignKey()
    category = models.ForeignKey()
    review = models.CharField(max_length=200, default='', blank=True)
To count non-empty reviews & group by category, I use
Review.objects.values('category').annotate(count=Count('review'))
This doesn't work because annotate counts empty values also (if the entry was NULL, it wouldn't have done so). I could filter out empty strings before the annotate call but my Query is more complex and I need all empty & non-empty objects.
Is there a smarter way to use annotate and skip empty values from count or should I change the model from
review = models.CharField(max_length=200, default='', blank=True)
to
review = models.CharField(max_length=200, default=None, blank=True, null=True)
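I faced a very similar situation. I solved it using Conditional Expressions, mapping each review to 0 or 1 and summing the result per category:
from django.db.models import Case, IntegerField, Sum, When
review_count = Case(
    When(review='', then=0),
    default=1,
    output_field=IntegerField(),
)
# Sum the 0/1 flags so each category gets its number of non-empty reviews
Review.objects.values('category').annotate(count=Sum(review_count))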
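The counting rule, where every row stays in its group but only non-empty reviews add to the total, can be checked with a plain-Python sketch. The rows below are made-up sample data, not the asker's, and the code does not touch the ORM.

```python
from collections import defaultdict

reviews = [
    {"category": "fiction", "review": "Great"},
    {"category": "fiction", "review": ""},
    {"category": "history", "review": ""},
    {"category": "history", "review": "Dry"},
    {"category": "history", "review": "Thorough"},
    {"category": "poetry", "review": ""},
]

counts = defaultdict(int)
for row in reviews:
    # An empty review contributes 0, any other review contributes 1.
    counts[row["category"]] += 0 if row["review"] == "" else 1
counts = dict(counts)
```

A category with only empty reviews still shows up with a count of 0. Filtering the empty rows out before grouping would lose that category.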
...and I need all empty & non-empty objects.
This doesn't make any sense when using values. Instead of actual objects, you'll get a list of dictionaries containing just the category and count keys. Apart from a different number in count, you'll see no difference between filtering out empty review values or not. On top of that, you filter for a single book (id=2) and somehow expect that there can be more than one review.
You need to seriously rethink what you are exactly trying to do, and how your model definition fits into that.

Django: Filter in multiple models linked via ForeignKey?

I'd like to create a filter-sort mixin for following values and models:
class Course(models.Model):
    title = models.CharField(max_length=70)
    description = models.TextField()
    max_students = models.IntegerField()
    min_students = models.IntegerField()
    is_live = models.BooleanField(default=False)
    is_deleted = models.BooleanField(default=False)
    teacher = models.ForeignKey(User)
class Session(models.Model):
    course = models.ForeignKey(Course)
    title = models.CharField(max_length=50)
    description = models.TextField(max_length=1000, default='')
    date_from = models.DateField()
    date_to = models.DateField()
    time_from = models.TimeField()
    time_to = models.TimeField()
class CourseSignup(models.Model):
    course = models.ForeignKey(Course)
    student = models.ForeignKey(User)
    enrollment_date = models.DateTimeField(auto_now=True)
class TeacherRating(models.Model):
    course = models.ForeignKey(Course)
    teacher = models.ForeignKey(User)
    rated_by = models.ForeignKey(User)
    rating = models.IntegerField(default=0)
    comment = models.CharField(max_length=300, default='')
A Course could be 'Discrete mathematics 1'
Session are individual classes related to a Course (e.g. 1. Introduction, 2. Chapter I, 3 Final Exam etc.) combined with a date/time
CourseSignup is the "enrollment" of a student
TeacherRating keeps track of a student's rating for a teacher (after course completion)
I'd like to implement following functions
Sort (asc, desc) by Date (earliest Session.date_from), Course.Name
Filter by: Date (earliest Session.date_from and last Session.date_to), Average TeacherRating (e.g. minimum value = 3), CourseSignups (e.g. minimum 5 users signed up)
(these options are passed via a GET parameters, e.g. sort=date_ascending&f_min_date=10.10.12&...)
How would you create a function for that?
I've tried using
denormalization (I added fields to Course for the required filter/sort criteria and updated them whenever changes happened), but I'm not very satisfied with it (e.g. it needs lots of updates after each TeacherRating).
ForeignKey queries (Course.objects.filter(session__date_from=xxx)), but I might run into performance issues later on.
Thanks for any tips!
In addition to using the Q object for advanced AND/OR queries, get familiar with reverse lookups.
Django creates reverse lookups for foreign key relationships. In your case you can get all Sessions belonging to a Course in one of two ways, each of which can be filtered further.
c = Course.objects.get(id=1)
sessions = Session.objects.filter(course__id=c.id) # First way, forward lookup.
sessions = c.session_set.all() # Second way using the reverse lookup session_set added to Course object.
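You'll also want to familiarize yourself with annotate() and aggregate(); these allow you to calculate fields and order/filter on the results, for example with Count, Sum, Avg, Min, Max, etc. Inside an aggregate, a reverse relation is referenced by the lowercase model name (e.g. coursesignup), not by the _set manager name.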
from datetime import datetime, timedelta
from django.db.models import Avg, Count, Min
courses_with_at_least_five_students = Course.objects.annotate(
    num_students=Count('coursesignup')
).order_by(
    '-num_students'
).filter(
    num_students__gte=5
)
course_earliest_session_within_last_240_days_with_avg_teacher_rating_below_4 = Course.objects.annotate(
    min_session_date_from=Min('session__date_from')
).annotate(
    avg_teacher_rating=Avg('teacherrating__rating')
).order_by(
    'min_session_date_from',
    '-avg_teacher_rating'
).filter(
    min_session_date_from__gte=datetime.now() - timedelta(days=240),
    avg_teacher_rating__lte=4
)
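As a rough illustration of what the first annotated query computes (count signups per course, keep courses with at least five, order by count descending), here is a plain-Python sketch. It uses generated sample data instead of the ORM, and the course and student names are invented.

```python
from collections import Counter

# (course title, student) pairs standing in for CourseSignup rows
signups = (
    [("Algebra", f"student{i}") for i in range(5)]
    + [("Logic", f"student{i}") for i in range(2)]
    + [("Statistics", f"student{i}") for i in range(6)]
)

# Equivalent of annotate(num_students=Count(...))
num_students = Counter(course for course, _ in signups)

# Equivalent of filter(num_students__gte=5).order_by('-num_students')
popular = sorted(
    (course for course, n in num_students.items() if n >= 5),
    key=lambda course: -num_students[course],
)
```

The filter runs on the aggregated count rather than on individual rows, which is the difference between filtering before and after annotate().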
The Q is used to allow you to make logical AND and logical OR in the queries.
I recommend you take a look at complex lookups: https://docs.djangoproject.com/en/1.5/topics/db/queries/#complex-lookups-with-q-objects
The following query might not work in your case (what does the teacher model look like?), but I hope it serves as an indication of how to use the complex lookup.
from django.db.models import Q
Course.objects.filter(Q(session__date__range=(start,end)) &
                      Q(teacher__rating__gt=3))
Unless absolutely necessary I'd indeed steer away from denormalization.
Your sort question wasn't entirely clear to me. Would you like to display Courses, filtered by date_from, and sort it by Date, Name?