Django recommends not using null on CharField, but annotate includes empty strings in the count. Is there a way to avoid that without excluding rows with an empty string from the query?
My question isn't simply how to achieve my query but, more fundamentally, whether annotate/aggregate counts should include empty fields at all. Django treats the empty string as the replacement for NULL on string-based fields.
My models:
class Book(models.Model):
    name = models.CharField(...)
class Review(models.Model):
    book = models.ForeignKey()
    category = models.ForeignKey()
    review = models.CharField(max_length=200, default='', blank=True)
To count non-empty reviews & group by category, I use
Review.objects.values('category').annotate(count=Count('review'))
This doesn't work because annotate also counts empty values (had the entry been NULL, it would not have been counted). I could filter out empty strings before the annotate call, but my query is more complex and I need all objects, empty and non-empty.
Is there a smarter way to use annotate and skip empty values from count or should I change the model from
review = models.CharField(max_length=200, default='', blank=True)
to
review = models.CharField(max_length=200, default=None, blank=True, null=True)
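I faced a very similar situation. I solved it using Conditional Expressions, summing a 0/1 flag per row so that the result is still aggregated per category:
from django.db.models import Case, IntegerField, Sum, When
review_count = Case(
    When(review='', then=0),
    default=1,
    output_field=IntegerField(),
)
Review.objects.values('category').annotate(count=Sum(review_count))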
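To see why empty strings end up in the count, it can help to look at the SQL level. The following is a minimal sketch using Python's built-in sqlite3 module rather than the Django ORM; the table layout and sample rows are made up for illustration. It compares COUNT(column), which skips NULL but not '', with a CASE expression summed per group, which is roughly what a conditional expression expresses. Unlike the Case expression above, the CASE here also treats NULL as empty so that the two columns are easy to compare.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE review (category TEXT, review TEXT)")
conn.executemany(
    "INSERT INTO review VALUES (?, ?)",
    [
        ("fiction", "Great read"),
        ("fiction", ""),      # empty string, as stored with default='', blank=True
        ("fiction", None),    # NULL, as a null=True column would allow
        ("history", ""),
    ],
)

rows = conn.execute(
    """
    SELECT category,
           COUNT(*)      AS total,
           COUNT(review) AS count_column,
           SUM(CASE WHEN review = '' OR review IS NULL THEN 0 ELSE 1 END) AS non_empty
    FROM review
    GROUP BY category
    ORDER BY category
    """
).fetchall()

print(rows)  # [('fiction', 3, 2, 1), ('history', 1, 1, 0)]
```

COUNT(review) reports 2 for fiction because only the NULL row is skipped, while the summed CASE reports the single non-empty review. All rows remain in the result set, so other aggregates over the same query are unaffected.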
...and I need all empty & non-empty objects.
This doesn't make any sense when using values. Instead of actual objects, you'll get a list of dictionaries containing just the category and count keys. Apart from a different number in count, you'll see no difference between filtering out empty review values or not. On top of that, you filter for a single book (id=2) and somehow expect that there can be more than one review.
You need to seriously rethink what you are exactly trying to do, and how your model definition fits into that.
I have a set of four models representing rental properties.
The Property model stores the property's name and who created it.
class Property(models.Model):
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    creator = models.ForeignKey(User, related_name='creator', on_delete=models.PROTECT)
    name = models.CharField(max_length=100)
A Property can have many Area instances through a ForeignKey relationship. An Area has an AreaType through a ForeignKey relationship as well, and many Amenities through a ManyToManyField.
class Area(models.Model):
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    type = models.ForeignKey(AreaType, on_delete=models.PROTECT)
    property = models.ForeignKey(Property, on_delete=models.PROTECT)
    amenities = models.ManyToManyField(Amenity, null=True, blank=True)
class AreaType(models.Model):
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    name = models.CharField(max_length=100)
class Amenity(models.Model):
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    name = models.CharField(max_length=100)
I am familiar with field lookups that do and do not span relationships, as well as filtering through kwargs.
What I am not familiar with is combining field lookups and aggregation functions.
And therefore, given the models above, I would like to:
Retrieve all Properties that have at least 3 Amenities with the name 'Television'.
Retrieve all Properties that have at least 2 Areas with an AreaType name of 'Bathroom'.
To retrieve all properties with amenities named Television:
Big thanks to @Waldemar Podsiadło; I had misunderstood the question, and he provided the correct answer for counting TVs.
Property.objects.filter(area_set__amenities__name="Television")\
    .annotate(tv_count=models.Count('area_set__amenities'))\
    .filter(tv_count__gte=3)
This should retrieve all properties related to amenities named Television, with a count greater than or equal to 3.
You did not define a related_name, so the reverse accessor for Area is area_set.
From there on, you have the properties available to you as fields, so you can just continue to filter from that.
To access a field in the query, separate it with double underscores (__). This also works with expressions, like __count and __sum (most of the time).
You can list the lookups available on a field with:
Property._meta.get_field('your_field_name').get_lookups()
The Django documentation on querysets and filtering covers this in more detail.
Edit: Kwargs example:
def filter_my_queryset(tv_count: dict, *args, **kwargs):
    # tv_count is a second dict for the post-annotation filter, e.g. {'tv_count__gte': 3}.
    # It is only needed if the comparison varies; a fixed gte filter could be built dynamically.
    return Property.objects.filter(**kwargs)\
        .annotate(tv_count=models.Count(*args))\
        .filter(**tv_count)
Or you could derive the amenities path dynamically by comparing the last part of the expression with the fields on a model; if it's not a field, cut it off. In this example that should trim area_set__amenities__name to area_set__amenities.
You can get a list of fields from the model with:
field_name_list = []
for i in mymodel._meta.fields:
    field_name_list.append(i.name)
Alternatively, use hasattr() to check whether a field exists. This might give false positives, though, since any attribute counts toward it.
Sidenote:
Read this doc, I think you'll find it very useful if you need to pack a lot of queries into functions. Custom QuerySets are great!
https://docs.djangoproject.com/en/4.1/topics/db/managers/
@nigel239 I think you don't need to use _set when filtering on a reverse relationship; you just use the model name. In addition, you can do more with the Django ORM. For the first question it goes like:
Property.objects.filter(area__amenities__name="Television")\
    .annotate(tv_ocurance=Count('area__amenities', filter=Q(area__amenities__name="Television")))\
    .filter(tv_ocurance__gte=3)
You can do exactly the same for the second query.
UPDATE:
As @nigel239 suggested, it can be done like:
Property.objects.filter(area__amenities__name="Television")\
    .annotate(tv_count=Count('area__amenities'))\
    .filter(tv_count__gte=3)
Or my way:
Property.objects\
    .annotate(tv_ocurance=Count('area__amenities', filter=Q(area__amenities__name="Television")))\
    .filter(tv_ocurance__gte=3)
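As a rough picture of what "filter, annotate a count, then filter on the count" means at the database level, here is a small sketch with Python's built-in sqlite3 module. The table names, column names, and sample rows are assumptions made up for this illustration, and the SQL is a hand-written approximation, not the query Django actually generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE property (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE area (id INTEGER PRIMARY KEY, property_id INTEGER);
    CREATE TABLE amenity (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE area_amenities (area_id INTEGER, amenity_id INTEGER);

    INSERT INTO property VALUES (1, 'Villa'), (2, 'Cabin');
    INSERT INTO area VALUES (10, 1), (11, 1), (12, 1), (20, 2);
    INSERT INTO amenity VALUES (100, 'Television'), (101, 'Sofa');
    -- Villa: a Television in three areas plus one Sofa; Cabin: one Television.
    INSERT INTO area_amenities VALUES
        (10, 100), (11, 100), (12, 100), (10, 101), (20, 100);
    """
)

# WHERE plays the role of the first filter(), GROUP BY + COUNT the annotate(),
# and HAVING the second filter() on the annotated count.
query = """
    SELECT p.name, COUNT(am.id) AS tv_count
    FROM property p
    JOIN area a ON a.property_id = p.id
    JOIN area_amenities aa ON aa.area_id = a.id
    JOIN amenity am ON am.id = aa.amenity_id
    WHERE am.name = 'Television'
    GROUP BY p.id
    HAVING COUNT(am.id) >= 3
"""
result = conn.execute(query).fetchall()
print(result)  # [('Villa', 3)]
```

The count is taken over joined rows, so a Television attached to three different areas of the same property counts three times, which matches the intent of the first question.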
Suppose I have an object:
survey = Survey.objects.all().first()
and I want to create a relationship between it and a group of objects:
respondents = Respondent.objects.all()
for r in respondents:
    r.eligible_for.add(survey)
where eligible_for is an M2M field in the Respondent model.
Is there any way to do it in one pass, or do I need to loop over the queryset?
models.py
class Questionnaire(models.Model):
    label = models.CharField(max_length=48)
    organization = models.ForeignKey('Home.Organization', on_delete=models.PROTECT, null=True)
class Respondent(models.Model):
    user = models.OneToOneField('Home.ListenUser', on_delete=models.CASCADE)
    organization = models.ForeignKey('Home.Organization', on_delete=models.PROTECT)
    eligible_for = models.ManyToManyField('Questionnaire', related_name='eligible_users', blank=True)
.add(…) can take a variable number of items and will usually add them in bulk.
You can thus add all the respondents with:
survey.eligible_users.add(*respondents)
Here we add the respondents through the reverse side of the relation. Notice the asterisk (*) in front of respondents, which performs iterable unpacking.
https://docs.djangoproject.com/en/2.2/ref/models/relations/
survey.eligible_users.set(respondents) is another way to do this without having to unpack the list. Note that set() replaces the relation: respondents that are not in the given collection are removed from it.
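The difference between the two call styles can be sketched without Django. FakeRelation below is a hypothetical in-memory stand-in written for this illustration, not Django's related manager; it only mimics the calling conventions (positional arguments for add, a single iterable for set) and the replace-versus-extend behaviour.

```python
class FakeRelation:
    """Hypothetical stand-in for a many-to-many manager (not Django code)."""

    def __init__(self, initial=()):
        self.members = set(initial)

    def add(self, *objs):
        # Accepts any number of positional arguments, like add(a, b, c).
        self.members.update(objs)

    def set(self, objs):
        # Accepts one iterable and replaces whatever was there before.
        self.members = set(objs)


respondents = ["ann", "bob"]

relation = FakeRelation(initial=["zoe"])
relation.add(*respondents)   # same as relation.add("ann", "bob")
after_add = sorted(relation.members)

relation = FakeRelation(initial=["zoe"])
relation.set(respondents)    # "zoe" is dropped
after_set = sorted(relation.members)

print(after_add)  # ['ann', 'bob', 'zoe']
print(after_set)  # ['ann', 'bob']
```

If the relation may already contain entries that should be kept, the add(*respondents) form is the safer of the two.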
Below is my post model.
class Post(models.Model):
    user = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
    title = models.CharField(max_length=200)
    content = models.TextField()
    datetime = models.DateTimeField(auto_now_add=True)
    votes = models.ManyToManyField(settings.AUTH_USER_MODEL,
                                   related_name="post_votes", default=None, blank=True)
    tags = models.ManyToManyField(Tag, default=None, blank=True)
I want to filter posts which contain a certain query in their title, content or as the name of one of their tags. To do this I've tried:
query_set = Post.objects.filter(Q(content__icontains=query)|
                                Q(tags__name__icontains=query)|
                                Q(title__icontains=query))
But this often returns QuerySets with duplicate results. I have tried using the distinct method to solve this, but that results in incorrect ordering when I sort the posts later on by the number of votes they have:
query_set.annotate(vote_count=Count('votes')).order_by('-vote_count', '-datetime')
If anybody could help me I would be very grateful.
Jack
The duplicates originate from the fact that you filter on related objects. This means that Django will perform a query with a JOIN in it. You can of course filter for uniqueness at the Django/Python level, but that is inefficient in two ways: more data is transmitted from the database to the Django server, and Python then has to deduplicate a potentially large collection.
Furthermore the line:
query_set.annotate(vote_count=Count('votes')).order_by('-vote_count', '-datetime')
is basically a no-op, since QuerySets are immutable. You did not sort the QuerySet on votes; you constructed a new one that does, but you immediately throw it away because you do nothing with the result.
You can add the annotation and ordering to the query and apply distinct() at the end:
query_set = Post.objects.filter(
    Q(content__icontains=query)|
    Q(tags__name__icontains=query)|
    Q(title__icontains=query)
).annotate(
    vote_count=Count('votes', distinct=True)
).order_by('-vote_count', '-datetime').distinct()
The distinct=True on the Count is necessary, since, as said before, the query acts like a JOIN, and JOINs can act like "multipliers" when counting things, since a row can occur multiple times.
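The "multiplier" effect is easy to reproduce outside Django. The sketch below uses Python's built-in sqlite3 module with made-up join tables (post_tags and post_votes are illustrative names, not necessarily the tables Django would create): one post with two tags and three votes yields six joined rows, so a plain COUNT over the votes reports 6 while COUNT(DISTINCT ...) reports 3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE post_tags (post_id INTEGER, tag_id INTEGER);
    CREATE TABLE post_votes (post_id INTEGER, user_id INTEGER);

    INSERT INTO post VALUES (1, 'Django tips');
    INSERT INTO post_tags VALUES (1, 1), (1, 2);
    INSERT INTO post_votes VALUES (1, 7), (1, 8), (1, 9);
    """
)

# Joining both related tables pairs every tag row with every vote row.
plain, distinct = conn.execute(
    """
    SELECT COUNT(v.user_id), COUNT(DISTINCT v.user_id)
    FROM post p
    LEFT JOIN post_tags t ON t.post_id = p.id
    LEFT JOIN post_votes v ON v.post_id = p.id
    GROUP BY p.id
    """
).fetchone()

print(plain, distinct)  # 6 3
```

Without the distinct count, a post with many tags would appear to have more votes than it really has, and the ordering by vote count would be skewed accordingly.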
I have two models:
class BookSeries(models.Model):
    title = models.CharField(max_length=200, null=False, blank=False, unique=True)
    #extra fields
class Book(models.Model):
    series = models.ForeignKey(BookSeries, blank=True, null=True, default=None)
    publisher = models.ForeignKey(Publisher, default=None, null=True, blank=True)
    title = models.CharField(max_length=200, null=False, blank=False, unique=True)
    #extra fields
Now I want to query all the books which don't belong to a series, plus only one book from each series (series can be null).
Problem statement:
I want to query all the individual books and series.
A series can have multiple books, and a book may not belong to a series. One solution is to query all the book objects that don't belong to a series and, separately, all the series objects, as described here. But this would put all the series together and all the books together in the response. I don't want them to be grouped like that (I am also using pagination).
Something like: Book.objects.filter(distinct only if (series is not None))
I thought of using distinct and exclude but couldn't make it work.
I would suggest following approach:
Get the ids of all the books which don't belong to a series:
ids_list1 = list(Book.objects.filter(series=None).values_list('id', flat=True))
Get the ids of the books which belong to a series, keeping only the first book per series with distinct (passing field names to distinct is only supported on PostgreSQL):
ids_list2 = list(Book.objects
                 .exclude(series=None)  # exclude ones which are not in a series
                 .order_by('series')  # order by series
                 .distinct('series')  # keep the first book in each series
                 .values_list('id', flat=True))
Now, you can combine these two lists and make another query to return only the books with these ids:
ids = ids_list1 + ids_list2
books = Book.objects.filter(id__in=ids)
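The selection rule behind the two id lists can be sketched in plain Python. The records below are made-up dictionaries, not model instances, and the loop only illustrates which rows are kept (every standalone book, plus the first book seen for each series); the queries above do this work in the database instead.

```python
books = [
    {"id": 1, "title": "Standalone A", "series": None},
    {"id": 2, "title": "Saga, part 1", "series": "Saga"},
    {"id": 3, "title": "Saga, part 2", "series": "Saga"},
    {"id": 4, "title": "Standalone B", "series": None},
    {"id": 5, "title": "Cycle, part 1", "series": "Cycle"},
]

seen_series = set()
selected_ids = []
for book in books:
    series = book["series"]
    if series is None:
        # Books without a series are always kept.
        selected_ids.append(book["id"])
    elif series not in seen_series:
        # Only the first book encountered for each series is kept.
        seen_series.add(series)
        selected_ids.append(book["id"])

print(selected_ids)  # [1, 2, 4, 5]
```

Because the final filter(id__in=ids) is a single queryset, standalone books and series representatives are interleaved according to whatever ordering is applied to it, which is what pagination needs.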
First exclude the books whose series is None, then call distinct('series'); this returns one book per series.
Book.objects.exclude(series=None).distinct('series')
If you need to exclude both null values and empty strings, the preferred way is to chain the conditions like so (the empty-string check only makes sense for a text field; a foreign key such as series can only be NULL):
Book.objects.exclude(series__isnull=True).exclude(series__exact='')
For more background, see the thread "Filtering for empty or NULL names in a queryset".
I have the example model below:
class Title(models.Model):
    name = models.CharField(max_length=200)
    provider_name = models.CharField(max_length=255, blank=True)
    keywords = models.CharField(max_length=200, null=True, blank=True)
    def __unicode__(self):
        return '{0} - {1} (name/provider)'.format(self.name, self.provider_name)
So to order the Title queryset by any model field, we can just do
titles = Title.objects.all().order_by('name')
But is it possible to order the queryset by a particular value? I mean I want to order the Title queryset by the return value of the __unicode__ method, i.e., the combination of name and provider_name ('{0} - {1} (name/provider)'.format(name, provider_name)).
So overall, instead of ordering by model fields/database columns, I want to order by a computed value (the return value of the __unicode__ method in this case).
Is it possible to order the queryset by such a value in the ORM, or do we need to write raw SQL to achieve this?
No, it's not possible to order by a method, because the ORM cannot translate it into SQL. In your case, if what you want is to order by name and then by provider name, ordering by the __unicode__ method (even if it were possible) might give you wrong results, since the names are not all the same length.
Using raw SQL for things Django can do is not a good idea either, so I think the best way to do it is:
titles = Title.objects.order_by('name', 'provider_name')
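The point about names of different lengths can be checked with plain Python sorting. The two sample titles below are invented for the illustration, and Python compares strings by code point, which may differ from your database's collation; the sketch only shows that sorting by the formatted label and sorting by (name, provider_name) are not guaranteed to agree.

```python
titles = [
    {"name": "C", "provider_name": "Acme"},
    {"name": "C ++", "provider_name": "Acme"},
]


def label(title):
    # Same format string as the __unicode__ method in the question.
    return "{0} - {1} (name/provider)".format(title["name"], title["provider_name"])


by_fields = [
    t["name"]
    for t in sorted(titles, key=lambda t: (t["name"], t["provider_name"]))
]
by_label = [t["name"] for t in sorted(titles, key=label)]

print(by_fields)  # ['C', 'C ++']
print(by_label)   # ['C ++', 'C']
```

In the label, the separator " - " that follows the shorter name is compared against the remaining characters of the longer name, so "C ++ - ..." sorts before "C - ...". Ordering by the two fields separately avoids this.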