Weird behavior in Django queryset union of values - django

I want to join the sum of related values from users with the users that do not have those values.
Here's a simplified version of my model structure:
class Person(models.Model):
    # irrelevant model fields
    pass
class Answer(models.Model):
    person = models.ForeignKey(Person)
    points = models.PositiveIntegerField(default=100)
    correct = models.BooleanField(default=False)
Sample dataset:
Person | Answer.Points
------ | ------
3 | 50
3 | 100
2 | 100
2 | 90
Person 4 has no answers and therefore no points.
With the query below, I can achieve the sum of points for each person:
people_with_points = Person.objects.\
    filter(answer__correct=True).\
    annotate(points=Sum('answer__points')).\
    values('pk', 'points')
<QuerySet [{'pk': 2, 'points': 190}, {'pk': 3, 'points': 150}]>
But some people might not have any related Answer entries, so they should get 0 points. With the query below I use Coalesce to "fake" their points:
people_without_points = Person.objects.\
    exclude(pk__in=people_with_points.values_list('pk')).\
    annotate(points=Coalesce(Sum('answer__points'), 0)).\
    values('pk', 'points')
<QuerySet [{'pk': 4, 'points': 0}]>
Both of these work as intended, but I want them in the same queryset, so I use the | operator to join them:
everyone = people_with_points | people_without_points
Now, for the problem:
After this, the people without points have their points value turned into None instead of 0.
<QuerySet [{'pk': 2, 'points': 190}, {'pk': 3, 'points': 150}, {'pk': 4, 'points': None}]>
Does anyone have any idea why this happens?
Thanks!

I should mention that I can fix that by annotating the queryset again and coalescing the null values to 0, like this:
everyone.\
    annotate(real_points=Coalesce(F('points'), 0)).\
    values('pk', 'real_points')
<QuerySet [{'pk': 2, 'real_points': 190}, {'pk': 3, 'real_points': 150}, {'pk': 4, 'real_points': 0}]>
But I wish to understand why the union does not work as I expected in my original question.
EDIT:
I think I got it. A friend suggested using django-debug-toolbar to inspect my SQL queries, and I found the following:
When the two querysets are combined, the annotation from the second query is somehow not considered, so the COALESCE to 0 is not used. Moving it to the first query propagates it to the combined result, and I get the expected values.
Basically, I changed the following:
# Moved the "Coalesce" to the initial query
people_with_points = Person.objects.\
    filter(answer__correct=True).\
    annotate(points=Coalesce(Sum('answer__points'), 0)).\
    values('pk', 'points')
# Second query does not have it anymore
people_without_points = Person.objects.\
    exclude(pk__in=people_with_points.values_list('pk')).\
    values('pk', 'points')
# We will have the values with 0 here!
everyone = people_with_points | people_without_points
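A possible alternative, sketched here at the SQL level with Python's built-in sqlite3 module rather than the Django ORM: a single LEFT JOIN with COALESCE(SUM(...), 0) returns everyone in one query, so nothing has to be combined afterwards. Note that | on querysets merges the two filters with SQL OR into one SELECT; it is not SQL UNION (that is what QuerySet.union() produces). The table and column names below, and the assumption that every sample answer is correct, are made up for illustration.

```python
import sqlite3

# In-memory database with hypothetical table names mirroring the models above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY);
    CREATE TABLE answer (
        id INTEGER PRIMARY KEY,
        person_id INTEGER REFERENCES person(id),
        points INTEGER NOT NULL,
        correct INTEGER NOT NULL
    );
    INSERT INTO person (id) VALUES (2), (3), (4);
    -- Assumption: all four sample answers are marked correct.
    INSERT INTO answer (person_id, points, correct) VALUES
        (3, 50, 1), (3, 100, 1), (2, 100, 1), (2, 90, 1);
""")

# The LEFT JOIN keeps people without answers; COALESCE turns their NULL sum into 0.
rows = conn.execute("""
    SELECT p.id, COALESCE(SUM(a.points), 0) AS points
    FROM person AS p
    LEFT JOIN answer AS a ON a.person_id = p.id AND a.correct = 1
    GROUP BY p.id
    ORDER BY p.id
""").fetchall()

everyone = [{"pk": pk, "points": points} for pk, points in rows]
print(everyone)
```

In the ORM, the closest equivalent I know of is annotating Person once with Coalesce(Sum('answer__points', filter=Q(answer__correct=True)), 0), which relies on the filter argument that aggregates accept since Django 2.0. I have not tested it against the models above.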

Related

Better way to make dictionary of counts from a QuerySet based on a field in Django?

I have a queryset of objects that could have any text value in their field and I would like a dictionary of the counts of the values in that field. For example:
I have a Test Drives Object which has a foreign key to vehicle, which has a make (Char field). I would like to know how many Test Drives each make had. I have actually solved it already, but I wonder if there is a better way using Django's inbuilt functionality.
My existing, working solution:
customer_test_drives_makes = customer_test_drives.values_list('vehicle__make', flat=True).order_by('vehicle__make').distinct()
customer_test_drives_makes_dictionary = {}
for make in customer_test_drives_makes:
    customer_test_drives_makes_dictionary[make] = customer_test_drives.filter(vehicle__make=make).count()
print(customer_test_drives_makes_dictionary)
This prints: {'BMW': 1, 'Honda': 1, 'Hyundai': 1, 'Mazda': 2}, which is correct.
There is. Try group by using annotate:
from django.db.models import Count
customer_test_drives.values('vehicle__make').annotate(count=Count('vehicle__make')).values()
Another example: Suppose your model looks like this:
class Cars(models.Model):
    vehicle = models.ForeignKey(Vehicle, on_delete=models.DO_NOTHING)
Then you can run this queryset as well:
Vehicle.objects.annotate(car_count=Count('cars'))
Ruddra's answer got me most of the way, but I had to make the values distinct. Here are the final solutions:
Pre-existing way:
queryset_makes = queryset.values_list('vehicle__make', flat=True).order_by('vehicle__make').distinct()
queryset_makes_dictionary = {}
for make in queryset_makes:
    queryset_makes_dictionary[make] = queryset.filter(vehicle__make=make).count()
print(queryset_makes_dictionary)
Result:
{'BMW': 1, 'Honda': 4, 'Hyundai': 1, 'Mazda': 2, 'Mitsubishi': 1, 'Nissan': 2}
------------
Updated Way:
print(queryset.values('vehicle__make').order_by('vehicle__make').distinct().annotate(count=Count('vehicle__make')))
Result:
<QuerySet [{'vehicle__make': 'BMW', 'count': 1}, {'vehicle__make': 'Honda', 'count': 4}, {'vehicle__make': 'Hyundai', 'count': 1}, {'vehicle__make': 'Mazda', 'count': 2}, {'vehicle__make': 'Mitsubishi', 'count': 1}, {'vehicle__make': 'Nissan', 'count': 2}]>
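Since the original goal was a dictionary, the grouped rows still need one more step. A minimal sketch, using plain dicts copied from the result above as a stand-in for the queryset (the name counts_by_make is made up):

```python
# Plain dicts standing in for the rows of the grouped queryset shown above.
rows = [
    {'vehicle__make': 'BMW', 'count': 1},
    {'vehicle__make': 'Honda', 'count': 4},
    {'vehicle__make': 'Hyundai', 'count': 1},
    {'vehicle__make': 'Mazda', 'count': 2},
    {'vehicle__make': 'Mitsubishi', 'count': 1},
    {'vehicle__make': 'Nissan', 'count': 2},
]

# One pass over the rows: make -> count.
counts_by_make = {row['vehicle__make']: row['count'] for row in rows}
print(counts_by_make)
```

The same comprehension should work when iterating the queryset directly, since a values() queryset yields dicts of this shape. That way only one query runs instead of one per make.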

Django group by add non existing choices

I have a model field that contains choices:
db_redirection_choices = (('A', 'first'), ('B', 'second'))
redirection_type = models.CharField(max_length=256, choices=db_redirection_choices, blank=True, null=True)
At some point I'm performing a group by on that column, counting all existing choices:
results = stats.values('redirection_type').annotate(amount=Count('redirection_type')).order_by('redirection_type')
However, this will give me results only for existing choices. I'd like to add the ones that are not present at all to the results, with a count of 0.
e.g. If the table contains only the entry
Id | redirection_type
--------------------------
1 | 'A'
then the annotate will return only
'A': 1
Of course that's normal, but I'd still like to get all non-existing choices in the results:
{'A': 1, 'B': 0}
What's the easiest way of accomplishing this?
I don't think there is an easy way to do it with the ORM, except maybe using a conditional expression, but that would make your query a lot more complicated, I think.
Why not do a simple post-processing in Python?
db_redirection_choices = (('A', 'first'), ('B', 'second'))
# I think your queryset will have a similar shape
results = [{'redirection_type': 'A', 'amount': 1}]
results_map = {
    **{choice: 0 for choice, _display in db_redirection_choices},
    **{res['redirection_type']: res['amount'] for res in results}
}
assert results_map == {'A': 1, 'B': 0}
If you don't need further processing in the ORM, that seems like the easiest.

Select record from a list of ids through M2M relations

First, sorry for my bad English; this problem is not trivial to explain, so I hope you will understand me.
I have 2 models as the following:
class A(models.Model):
    code = models.CharField(unique=True, max_length=10)
    list_of_b = models.ManyToManyField('B')
class B(models.Model):
    code = models.CharField(unique=True, max_length=10)
I aim to retrieve instances of A which match exactly with a given list of B ids.
For example, imagine I have the following records of A in my database:
id: 1 - code: X - list_of_b: [1, 2, 4, 6]
id: 2 - code: Y - list_of_b: [2, 5, 6]
id: 3 - code: Z - list_of_b: [2, 3, 4, 5, 6]
With [2, 5, 6] as the given list, I should retrieve records 2 and 3, not 1.
I managed to retrieve records with an exact match of ids using this query:
queryset = A.objects.prefetch_related('list_of_b')
queryset = queryset.annotate(nb=Count('list_of_b')).filter(nb=len(my_list))
for id in my_list:
    queryset = queryset.filter(list_of_b=id)
It works for record 2 but not for record 3.
Thanks for any help. Don't hesitate to question me if not clear enough. ;)
EDIT:
Just one more thing: it's also possible that my_list contains more IDs than necessary. For example, with [2, 5, 6, 7] I should retrieve records 2 and 3.
Just remove the filter by count:
queryset = A.objects.prefetch_related('list_of_b')
for id in my_list:
    queryset = queryset.filter(list_of_b=id)
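To make the semantics of the chained filter concrete: each .filter(list_of_b=id) call adds a further condition, so a record survives only if its related ids include every id in my_list. A rough pure-Python sketch of that rule, with sets standing in for the M2M rows (the records dict and the containing_all helper are hypothetical):

```python
# Hypothetical stand-in for the A records and their related B ids.
records = {
    1: {1, 2, 4, 6},
    2: {2, 5, 6},
    3: {2, 3, 4, 5, 6},
}

def containing_all(records, wanted_ids):
    """Return ids of records whose related ids include every wanted id."""
    wanted = set(wanted_ids)
    return sorted(pk for pk, related in records.items() if wanted <= related)

matches = containing_all(records, [2, 5, 6])
matches_with_extra = containing_all(records, [2, 5, 6, 7])
print(matches)
print(matches_with_extra)
```

Under this rule an id that no record has, such as 7 here, excludes every record, so the case described in the EDIT is not covered by the chained filter alone.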

Django model group as a list

I have a model with test data as below
id days
1, 30
1, 40
2, 10
2, 20
1, 90
I want output as
1, [30,40,90]
2, [10,20]
How can I get this in Django?
It's not much Django, it's pure python. To get the result as a mapping on 'id' as key:
result = {}
for obj in Mymodel.objects.all():
    # setdefault creates the list the first time an id is seen
    result.setdefault(obj.id, []).append(obj.days)
print(result)
>>> {1: [30, 40, 90], 2: [10, 20]}
The order of the elements in each list is not defined. If you require these to be ordered, best would be to append .order_by('days') on the Queryset.
A final remark: Your 'id' is not unique. I would consider a non-pk-column named 'id' a bad practice, since 'id' is Django's default name for the automatically created pk-field.
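The same grouping can be tried without a database. A small sketch using collections.defaultdict, with hard-coded (id, days) pairs standing in for what Mymodel.objects.values_list('id', 'days') would presumably return for the test data:

```python
from collections import defaultdict

# Hypothetical (id, days) pairs copied from the test data in the question.
rows = [(1, 30), (1, 40), (2, 10), (2, 20), (1, 90)]

# defaultdict(list) creates an empty list the first time a key is accessed.
days_by_id = defaultdict(list)
for row_id, days in rows:
    days_by_id[row_id].append(days)

result = dict(days_by_id)
print(result)
```

Each list keeps the order in which rows were read, so sorting the input by days first gives sorted lists.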

Django group by in many to many relationships

I have a model named Evaluation with following schema:
user = models.ForeignKey(User)
value = models.IntegerField()
The value field will take value in 0,1,2,3.
Now I want to get the count of evaluations of a given user with each value. For example, suppose my data are:
user.id | value
1 | 0
1 | 0
1 | 1
1 | 2
1 | 3
1 | 3
I want to get the result
value | count
0 | 2
1 | 1
2 | 1
3 | 2
I use the query
Evaluation.objects.filter(user=request.user).annotate(count=Count('value')).order_by('value')
But it does not return the correct answer. Could anyone help?
You can do it this way:
Evaluation.objects.filter(user=request.user).values('value').annotate(count=Count('value')).order_by('value')
Add the values() method:
Evaluation.objects.filter(user_id=request.user) \
    .values('value').annotate(count=Count('value')) \
    .order_by('value')
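The values('value') call before annotate() is what turns the count into a per-value GROUP BY; without it, the annotation is computed per Evaluation row. A sketch of the equivalent SQL using Python's built-in sqlite3 module (the table and column names are assumptions, not the names Django would generate):

```python
import sqlite3

# In-memory table holding the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE evaluation (user_id INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO evaluation (user_id, value) VALUES (?, ?)",
    [(1, 0), (1, 0), (1, 1), (1, 2), (1, 3), (1, 3)],
)

# Group one user's evaluations by value and count each group.
counts = conn.execute(
    "SELECT value, COUNT(value) FROM evaluation "
    "WHERE user_id = ? GROUP BY value ORDER BY value",
    (1,),
).fetchall()
print(counts)
```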
You could build a reverse query and query the User model instead:
User.objects.filter(pk=request.user.pk).values('evaluation__value').annotate(count=Count('evaluation__user'))
which will produce the results below:
[{'count': 1, 'evaluation__value': 1}, {'count': 1, 'evaluation__value': 2}, {'count': 2, 'evaluation__value': 0}, {'count': 2, 'evaluation__value': 3}]
Additionally you might want to sort the results:
queryset.order_by('-count') # sorts by count desc
Unfortunately, you cannot alias the value in the values() queryset method, hence the ugly evaluation__value as the field name. See this Django ticket.
HTH.