I have a model which has fields like:
class Vehicle(models.Model):
    car_name = models.CharField(max_length=255)
    car_color = models.CharField(max_length=255)
This model has a lot of duplicate values, and I would like only the distinct ones to be shown.
The queryset gives an output like:
<QuerySet [{'car_name': 'Audi', 'car_color': 'Red'}, {'car_name': 'BMW', 'car_color': 'white'}]>
[edit] I want my output QuerySet to have another field that acts as a counter for the output. If the query returns two objects, a field called ordinal_count should also be included in the queryset.
The output I am looking for, for example:
<QuerySet [{'car_name': 'Audi', 'car_color': 'Red', 'ordinal_count': 1}, {'car_name': 'BMW', 'car_color': 'white', 'ordinal_count': 2}, {'car_name': 'Jaguar', 'car_color': 'olive green', 'ordinal_count': 3}]>
This is the query which I wrote:
op = (Vehicle.objects.annotate(ordinal_count=Count("car_name", distinct="car_name")).filter(car_color='white', list_display=True).values("car_name","car_color", "ordinal_count"))
This is not giving me the desired output and is also messing up the filter. Should I even be using annotate?
Then I also wrote another query but it fails to run:
NotImplementedError: annotate() + distinct(fields) is not implemented.
Query is:
count = Vehicle.objects.filter(car_color='white', list_display=True).distinct(
    "car_name").annotate(ordinal_count=Count("car_name")).values_list(
    "car_name", "car_color", "ordinal_count")
[edit 1] I tried the solution by #trigo, but it gives me an output like the one below, where ordinal_count stays at 2. I want ordinal_count to act as a counter for the objects that come back in the queryset:
<QuerySet [{'car_name': 'Audi', 'car_color': 'Red', 'ordinal_count': 2}, {'car_name': 'BMW', 'car_color': 'white', 'ordinal_count': 2}, {'car_name': 'Jaguar', 'car_color': 'olive green', 'ordinal_count': 2}]>
[Update]:
qs = (
    Vehicle.objects.filter(car_color="white", list_display=True)
    .distinct("car_name")
    .annotate(ordinal_count=Window(expression=RowNumber(), partition_by=None))
    .values("slug", "car_name", "car_color", "ordinal_count")
).order_by('car_name')
The only issue here is that annotate seems to run before filtering, so ordinal_count gets messed up. Is there a way to filter first?
edit3
Suppose there are 5 objects in total and I have said I want all white color cars, with distinct car_name.
This is the output after applying the filter. Judging by the jumbled ordinal_count, the annotation seems to be computed before the filter is applied:
<QuerySet [{'car_name': 'Audi', 'car_color': 'white', 'ordinal_count': 3}, {'car_name': 'BMW', 'car_color': 'white', 'ordinal_count': 2}, {'car_name': 'Jaguar', 'car_color': 'white', 'ordinal_count': 5}]>
I do not have a complete example, but Django provides a Window expression (inside which you can use RowNumber()) that could help you:
from django.db.models.expressions import Window
from django.db.models.functions import RowNumber
query = Vehicle.objects.values(...).annotate(row_number=Window(expression=RowNumber()))
https://docs.djangoproject.com/en/4.1/ref/models/expressions/#window-functions
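To make the idea of "number the rows only after filtering and de-duplicating" concrete, here is a minimal sketch at the SQL level using Python's built-in sqlite3 module rather than the Django ORM. The table name, columns, and sample rows are assumptions made up for illustration, and ROW_NUMBER() requires SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical in-memory table and rows, used only to illustrate the SQL shape.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vehicle (id INTEGER PRIMARY KEY, car_name TEXT, car_color TEXT)"
)
conn.executemany(
    "INSERT INTO vehicle (car_name, car_color) VALUES (?, ?)",
    [
        ("BMW", "white"),
        ("Audi", "white"),
        ("Audi", "red"),
        ("BMW", "white"),
        ("Jaguar", "white"),
    ],
)

# The inner query filters and de-duplicates first; the outer query then
# numbers only the rows that survived.
rows = conn.execute(
    """
    SELECT car_name, car_color,
           ROW_NUMBER() OVER (ORDER BY car_name) AS ordinal_count
    FROM (
        SELECT DISTINCT car_name, car_color
        FROM vehicle
        WHERE car_color = 'white'
    ) AS filtered
    ORDER BY car_name
    """
).fetchall()

print(rows)
```

Because the window function runs over the already filtered sub-query, the counter comes out contiguous (1, 2, 3) instead of reflecting positions in the unfiltered table.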
[Assumption]: By duplicate you mean duplicates on the attribute "car_name" only. If you instead mean duplicates on "car_name" and "car_color" combined, the queryset below will not work. Let me know if that is the case and I will provide another solution.
The queryset below should give you your desired output.
Vehicle.objects.filter().values("car_name").annotate(ordinal_count=Count("id")).order_by("car_name")
[Note]: If your database is PostgreSQL, you can use distinct(); on other databases, the query above is a hacky way to achieve the same thing.
For complex queries, you would need to write the SQL query directly.
[EDIT 1]
OK, so ordinal_count is NOT the number of times the car_name has occurred but just a counter, right?
As per the expected output you provided, i.e.
<QuerySet [{'car_name': 'Audi', 'car_color': 'Red', 'ordinal_count': 1}, {'car_name': 'BMW', 'car_color': 'white', 'ordinal_count': 2}, {'car_name': 'Jaguar', 'car_color': 'olive green', 'ordinal_count': 3}]>
by ordinal_count you only mean the counter, i.e. the index of the array + 1? That is the only conclusion I can reach from your example. For this you may not need to annotate at all.
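If the counter really is just "index + 1", it can be attached in plain Python after the query has run. The sketch below assumes the rows have already been fetched as dictionaries; the sample list is a stand-in for the result of a .values(...) call, not real query output.

```python
# Hypothetical rows standing in for the result of a .values(...) query.
rows = [
    {"car_name": "Audi", "car_color": "Red"},
    {"car_name": "BMW", "car_color": "white"},
    {"car_name": "Jaguar", "car_color": "olive green"},
]

# Attach a 1-based counter to each row without touching the database.
numbered = [
    {**row, "ordinal_count": position}
    for position, row in enumerate(rows, start=1)
]

print(numbered)
```

The trade-off is that the result is a plain list rather than a QuerySet, so it cannot be filtered or chained further.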
from django.db.models import Count, Window, F
Vehicle.objects.distinct('car_name').annotate(ordinal_count=Window(
    expression=Count("id"),
    partition_by=[F("car_name")]
)).values(
    "car_name",
    "car_color",
    "ordinal_count"
).order_by('ordinal_count')
This query produces the following:
[{"car_name": "BMW", "car_color": "black", "ordinal_count": 5}...]
Related
I have a queryset of objects that could have any text value in their field and I would like a dictionary of the counts of the values in that field. For example:
I have a Test Drives Object which has a foreign key to vehicle, which has a make (Char field). I would like to know how many Test Drives each make had. I have actually solved it already, but I wonder if there is a better way using Django's inbuilt functionality.
My existing, working solution:
customer_test_drives_makes = customer_test_drives.values_list('vehicle__make', flat=True).order_by('vehicle__make').distinct()
customer_test_drives_makes_dictionary = {}
for make in customer_test_drives_makes:
    customer_test_drives_makes_dictionary[make] = customer_test_drives.filter(vehicle__make=make).count()
print(customer_test_drives_makes_dictionary)
This prints: {'BMW': 1, 'Honda': 1, 'Hyundai': 1, 'Mazda': 2}, which is correct.
There is. Try group by using annotate:
from django.db.models import Count
customer_test_drives.values('vehicle__make').annotate(count=Count('vehicle__make')).values()
Another example: Suppose your model looks like this:
class Cars(models.Model):
    vehicle = models.ForeignKey(Vehicle, on_delete=models.DO_NOTHING)
Then you can run this queryset as well:
Vehicle.objects.annotate(car_count=Count('cars'))
Ruddra's answer got me most of the way, but I had to make the values distinct. Here are the final solutions:
Pre-existing way:
queryset_makes = queryset.values_list('vehicle__make', flat=True).order_by('vehicle__make').distinct()
queryset_makes_dictionairy = {}
for make in queryset_makes:
    queryset_makes_dictionairy[make] = queryset.filter(vehicle__make=make).count()
print(queryset_makes_dictionairy)
Result:
{'BMW': 1, 'Honda': 4, 'Hyundai': 1, 'Mazda': 2, 'Mitsubishi': 1, 'Nissan': 2}
------------
Updated Way:
print(queryset.values('vehicle__make').order_by('vehicle__make').distinct().annotate(count=Count('vehicle__make')))
Result:
<QuerySet [{'vehicle__make': 'BMW', 'count': 1}, {'vehicle__make': 'Honda', 'count': 4}, {'vehicle__make': 'Hyundai', 'count': 1}, {'vehicle__make': 'Mazda', 'count': 2}, {'vehicle__make': 'Mitsubishi', 'count': 1}, {'vehicle__make': 'Nissan', 'count': 2}]>
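Since the original goal was a dictionary of counts, the grouped rows can be folded back into one with a dict comprehension. This is a small sketch; the list below copies a few of the rows from the result above and stands in for the evaluated queryset.

```python
# Rows shaped like the output of .values('vehicle__make').annotate(count=...).
rows = [
    {"vehicle__make": "BMW", "count": 1},
    {"vehicle__make": "Honda", "count": 4},
    {"vehicle__make": "Mazda", "count": 2},
]

# Map each make to its count, matching the shape of the loop-based version.
makes_to_counts = {row["vehicle__make"]: row["count"] for row in rows}

print(makes_to_counts)
```

This keeps the counting in a single grouped query and does only the reshaping in Python.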
I am new to Django. I am trying to make this query return count by a group. But it doesn't group data.
notification = AppointmentNotificationGroupAppointment.objects.filter(receiver__notification_group__group=group).values('receiver__notification_group__group', 'sender__status__name').annotate(pcount=Count('sender__status__name', distinct=True))
It returns:
{'receiver__notification_group__group': '841536_123856', 'sender__status__name': 'Pending', 'pcount': 1},
{'receiver__notification_group__group': '841536_123856', 'sender__status__name': 'Pending', 'pcount': 1},
{'receiver__notification_group__group': '841536_123856', 'sender__status__name': 'Confirmed', 'pcount': 1},
{'receiver__notification_group__group': '841536_123856', 'sender__status__name': 'Confirmed', 'pcount': 1}
What am I doing wrong? I want it to return distinct records, with the count computed per group.
You need to call order_by(...) too; otherwise a default ordering on the model can add extra columns to the GROUP BY:
AppointmentNotificationGroupAppointment.objects.filter(receiver__notification_group__group=group).values(
    'receiver__notification_group__group',
    'sender__status__name'
).annotate(pcount=Count('sender__status__name', distinct=True)
).order_by('receiver__notification_group__group')
I want to get the last 100 records of MyModel, ordered by '-end_game_time', and annotate them with a total for each winner type:
MyModel.objects.all()[:100].order_by('-end_game_time').values('winner').annotate(total=Count('winner'))
The resulting queryset is below, and it does not have the groups I expected:
<QuerySet [{'winner': 3, 'total': 1}, {'winner': 15, 'total': 1}, 'total': 1}, {'winner': 3, 'total': 1}, {'winner': 5, 'total': 1}, {'winner': 15, 'total': 1}, {'winner': 5, 'total': 1}, {'winner': 3, 'total': 1}, '...(remaining elements truncated)...']>
The generated query looks like:
SELECT "game_mymodel"."winner", COUNT("game_mymodel"."winner") AS "total" FROM "game_mymodel" GROUP BY "game_mymodel"."winner", "game_mymodel"."end_game_time" ORDER BY "game_mymodel"."end_game_time" DESC LIMIT 100
But when I leave out the order_by, the result is as I expected:
MyModel.objects.all()[:100].values('winner').annotate(total=Count('winner'))
Out[52]: <QuerySet [{'winner': 5, 'total': 43}, {'winner': 1, 'total': 2}, {'winner': 15, 'total': 51}, {'winner': 2, 'total': 42}, {'winner': 3, 'total': 43}]>
and the GROUP BY part of the generated query is different:
SELECT "game_mymodel"."winner", COUNT("game_mymodel"."winner") AS "total" FROM "game_mymodel" GROUP BY "game_mymodel"."winner" LIMIT 100
As far as I know, it is not possible to achieve what you want in a single query. What you want in SQL is:
SELECT "game_mymodel"."winner", COUNT("game_mymodel"."winner") AS "total" FROM "game_mymodel" GROUP BY "game_mymodel"."winner" ORDER BY "game_mymodel"."end_game_time" DESC LIMIT 100
which is not a valid SQL query, so you need a sub-query that selects the 100 elements and then apply your aggregation to them.
First build the sub-query:
top_100_games = MyModel.objects.order_by('-end_game_time')[:100].only('id').all()
And then use it in main query:
MyModel.objects.filter(id__in=top_100_games).values('winner').annotate(total=Count('winner'))
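The "limit first, then group" shape can be checked at the SQL level with Python's built-in sqlite3 module. This is only a sketch: the table mirrors the game_mymodel name from the generated SQL above, but the rows are invented and the limit is 4 instead of 100 to keep the example small.

```python
import sqlite3

# Hypothetical in-memory copy of the table with a handful of invented rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE game_mymodel "
    "(id INTEGER PRIMARY KEY, winner INTEGER, end_game_time INTEGER)"
)
conn.executemany(
    "INSERT INTO game_mymodel (winner, end_game_time) VALUES (?, ?)",
    [(3, 1), (5, 2), (3, 3), (15, 4), (3, 5), (5, 6)],
)

# The sub-query picks the ids of the latest 4 games; the outer query groups
# only those rows, so the ordering column never reaches the GROUP BY.
totals = conn.execute(
    """
    SELECT winner, COUNT(winner) AS total
    FROM game_mymodel
    WHERE id IN (
        SELECT id FROM game_mymodel ORDER BY end_game_time DESC LIMIT 4
    )
    GROUP BY winner
    ORDER BY winner
    """
).fetchall()

print(totals)
```

Note that not every database accepts LIMIT inside an IN sub-query, so the equivalent ORM query may need to be adapted depending on the backend.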
I want to join the sum of related values from users with the users that do not have those values.
Here's a simplified version of my model structure:
class Answer(models.Model):
    person = models.ForeignKey(Person)
    points = models.PositiveIntegerField(default=100)
    correct = models.BooleanField(default=False)
class Person(models.Model):
    # irrelevant model fields
Sample dataset:
Person | Answer.Points
------ | ------
3 | 50
3 | 100
2 | 100
2 | 90
Person 4 has no answers and therefore no points.
With the query below, I can achieve the sum of points for each person:
people_with_points = Person.objects.\
    filter(answer__correct=True).\
    annotate(points=Sum('answer__points')).\
    values('pk', 'points')
<QuerySet [{'pk': 2, 'points': 190}, {'pk': 3, 'points': 150}]>
But, since some people might not have any related Answer entries, they will have 0 points and with the query below I use Coalesce to "fake" their points, like so:
people_without_points = Person.objects.\
    exclude(pk__in=people_with_points.values_list('pk')).\
    annotate(points=Coalesce(Sum('answer__points'), 0)).\
    values('pk', 'points')
<QuerySet [{'pk': 4, 'points': 0}]>
Both of these work as intended but I want to have them in the same queryset so I use the union operator | to join them:
everyone = people_with_points | people_without_points
Now, for the problem:
After this, the people without points have their points value turned into None instead of 0.
<QuerySet [{'pk': 2, 'points': 190}, {'pk': 3, 'points': 150}, {'pk': 4, 'points': None}]>
Does anyone have any idea why this happens?
Thanks!
I should mention that I can fix that by annotating the queryset again and coalescing the null values to 0, like this:
everyone.\
    annotate(real_points=Concat(Coalesce(F('points'), 0), Value(''))).\
    values('pk', 'real_points')
<QuerySet [{'pk': 2, 'real_points': 190}, {'pk': 3, 'real_points': 150}, {'pk': 4, 'real_points': 0}]>
But I wish to understand why the union does not work as I expected in my original question.
EDIT:
I think I got it. A friend instructed me to use django-debug-toolbar to check my SQL queries to investigate further on this situation and I found out the following:
Since it's a union of two queries, the second query annotation is somehow not considered and the COALESCE to 0 is not used. By moving that to the first query it is propagated to the second query and I could achieve the expected result.
Basically, I changed the following:
# Moved the "Coalesce" to the initial query
people_with_points = Person.objects.\
    filter(answer__correct=True).\
    annotate(points=Coalesce(Sum('answer__points'), 0)).\
    values('pk', 'points')
# Second query does not have it anymore
people_without_points = Person.objects.\
    exclude(pk__in=people_with_points.values_list('pk')).\
    values('pk', 'points')
# We will have the values with 0 here!
everyone = people_with_points | people_without_points
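For comparison, here is a different technique from the two-queryset approach above: a single query that LEFT JOINs only the correct answers and coalesces the missing sums to 0. It is a sketch using Python's built-in sqlite3 module; the table names are hypothetical, the rows follow the sample dataset, and marking every answer as correct is an assumption. In the ORM this would presumably correspond to passing filter=Q(answer__correct=True) to Sum inside Coalesce, which has not been tested against these models.

```python
import sqlite3

# Hypothetical tables mirroring the simplified Person / Answer models.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE answer (person_id INTEGER, points INTEGER, correct INTEGER)"
)
conn.executemany("INSERT INTO person (id) VALUES (?)", [(2,), (3,), (4,)])
# Assumption: all sample answers are marked correct.
conn.executemany(
    "INSERT INTO answer (person_id, points, correct) VALUES (?, ?, ?)",
    [(3, 50, 1), (3, 100, 1), (2, 100, 1), (2, 90, 1)],
)

# The LEFT JOIN keeps people without matching answers; COALESCE turns their
# NULL sum into 0, so no second query is needed.
everyone = conn.execute(
    """
    SELECT p.id, COALESCE(SUM(a.points), 0) AS points
    FROM person AS p
    LEFT JOIN answer AS a ON a.person_id = p.id AND a.correct = 1
    GROUP BY p.id
    ORDER BY p.id
    """
).fetchall()

print(everyone)
```

Putting the "correct" condition in the join rather than in a WHERE clause is what keeps person 4 in the result.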
I have a model with test data as below
id days
1, 30
1, 40
2, 10
2, 20
1, 90
I want output as
1, [30,40,90]
2, [10,20]
How can I get this in Django?
It's not much Django, it's pure Python. To get the result as a mapping with 'id' as the key:
result = {}
for obj in Mymodel.objects.all():
    if obj.id in result:
        result[obj.id].append(obj.days)
    else:
        result[obj.id] = [obj.days]
print(result)
>>> {1: [30, 40, 90], 2: [10, 20]}
The order of the elements in each list is not defined. If you require these to be ordered, best would be to append .order_by('days') on the Queryset.
A final remark: Your 'id' is not unique. I would consider a non-pk-column named 'id' a bad practice, since 'id' is Django's default name for the automatically created pk-field.
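The same grouping can be written a little more compactly with collections.defaultdict. This sketch works on plain (id, days) tuples taken from the test data in the question instead of model instances, and sorts them first so that each list comes out ordered.

```python
from collections import defaultdict

# (id, days) pairs copied from the test data in the question.
rows = [(1, 30), (1, 40), (2, 10), (2, 20), (1, 90)]

# defaultdict creates the empty list on first access, so no membership
# check is needed; sorting first keeps each list in ascending order.
grouped = defaultdict(list)
for row_id, days in sorted(rows):
    grouped[row_id].append(days)

result = dict(grouped)
print(result)
```

With a real queryset, the sorting would more naturally be done in the database via order_by, as noted above.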