Django query ForeignKey Count() zero - django

I have 3 tables:
Truck with the fields: id, name....
Menu with the fields: id, itemname, id_foodtype, id_truck...
Foodtype with the fields: id, type...
I want to get a summary like:
id name total
10 Alcoholic drink 0
5 Appetizer 11
My problem is to return the results with 0 elements.
I tried an SQL query like this:
SELECT
ft.id, ft.name, COUNT(me.id) total
FROM
foodtype ft LEFT JOIN menu me
ON ft.id = me.id_foodtype
LEFT JOIN truck tr
ON tr.id = me.id_truck AND tr.id = 3
GROUP BY ft.id, ft.name
ORDER BY ft.name
or a query in Django
Menu.objects.filter(id_truck=3).values("id_foodtype").annotate(cnt=Count("id_foodtype"))
But, neither is displaying the results with Zero elements.
When I convert this query to Python code, none of my queries returns the result I expect.
How can I return results with the Left Join including the foodtypes with zero elements in the menu?

The direction of the LEFT JOIN depends on the model where you start the query. If it starts on Menu, you will never see a FoodType that is unused by the selected Menu items. It is then important to filter (by Truck in your case) in a way that also allows a null Menu.id, so that you can get Count == 0.
from django.db.models import Count, Q
qs = (
    FoodType.objects
    .filter(Q(menu_set__id_truck=3) | Q(menu_set__id__isnull=True))
    .values()  # not necessary, but useful if you want a dict, not a Model object
    .annotate(cnt=Count("menu_set__id"))
)
Verify:
>>> print(str(qs.query))
SELECT foodtype.id, foodtype..., COUNT(menu.id) AS cnt
FROM foodtype
LEFT OUTER JOIN menu ON (foodtype.id = menu.id_foodtype)
WHERE (menu.id_truck = 3 OR menu.id IS NULL)
GROUP BY foodtype.id
It works with Django 2.0b1 and 1.8, the newest and oldest versions current at the time of writing.
The query is the same with or without the line .values(). The results are dictionaries or FoodType objects with a cnt attribute.
Footnotes:
The name menu_set should be replaced by the real related_name of foreign key id_foodtype if you have defined the related_name.
class Menu(models.Model):
    id_foodtype = models.ForeignKey('FoodType', on_delete=models.DO_NOTHING,
                                    db_column='id_foodtype', related_name='menu_set')
    ...
If you are starting a new project, I recommend naming the foreign key without "id" and keeping "id" only in db_column. Then menu_item.foodtype is a FoodType object and menu_item.foodtype_id is its id (stored in the id_foodtype column).
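One caveat worth checking against your own data: with the truck condition in WHERE, a food type that appears only on other trucks' menus is dropped instead of being reported with 0. The sketch below reproduces both variants in plain SQL with Python's built-in sqlite3 module. The table contents are made up for illustration, and the column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foodtype (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE menu (id INTEGER PRIMARY KEY, itemname TEXT,
                       id_foodtype INTEGER, id_truck INTEGER);
    INSERT INTO foodtype VALUES (5, 'Appetizer'), (10, 'Alcoholic drink'), (7, 'Dessert');
    INSERT INTO menu VALUES
        (1, 'Nachos', 5, 3),
        (2, 'Wings', 5, 3),
        (3, 'Beer', 10, 4),   -- sold by another truck only
        (4, 'Flan', 7, 3);
""")

# Truck condition inside the ON clause: every food type survives the join.
on_clause = conn.execute("""
    SELECT ft.id, ft.name, COUNT(me.id) AS total
    FROM foodtype ft
    LEFT JOIN menu me ON ft.id = me.id_foodtype AND me.id_truck = 3
    GROUP BY ft.id, ft.name
    ORDER BY ft.name
""").fetchall()

# Truck condition in WHERE: 'Alcoholic drink' has a menu row, but for truck 4,
# so it matches neither branch of the OR and disappears.
where_clause = conn.execute("""
    SELECT ft.id, ft.name, COUNT(me.id) AS total
    FROM foodtype ft
    LEFT JOIN menu me ON ft.id = me.id_foodtype
    WHERE me.id_truck = 3 OR me.id IS NULL
    GROUP BY ft.id, ft.name
    ORDER BY ft.name
""").fetchall()

print(on_clause)     # [(10, 'Alcoholic drink', 0), (5, 'Appetizer', 2), (7, 'Dessert', 1)]
print(where_clause)  # [(5, 'Appetizer', 2), (7, 'Dessert', 1)]
```

If that case matters, Django 2.0 and later accept a filter argument on aggregates, e.g. Count("menu_set", filter=Q(menu_set__id_truck=3)), which is one way to keep the zero rows from the ORM.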

Related

Django Query to get count of all distinct values for column of ArrayField

What is the most efficient way to count all distinct values for column of ArrayField.
Let's suppose I have a model with the name MyModel and cities field which is postgres.ArrayField.
#models.py
class MyModel(models.Model):
....
cities = ArrayField(models.TextField(blank=True),blank=True,null=True,default=list) ### ['mumbai','london']
and let's suppose our MyModel has the following 3 objects with cities field value as follow.
1. ['london','newyork']
2. ['mumbai']
3. ['london','chennai','mumbai']
Counting distinct values of the cities field groups by the entire list instead of by each element.
## Query
MyModel.objects.values('cities').annotate(Count('id')).order_by().filter(id__count__gt=0)
Here I would like to count distinct values per element of the cities lists, which should give the following final output.
[{'london':2},{'newyork':1},{'chennai':1},{'mumbai':2}]
Perform the GROUP BY operation at the database level itself.
from django.db import connection
cursor = connection.cursor()
raw_query = """
select unnest(subquery_alias.cities) as distinct_cities, count(*) as cities_group_by_count
from (select cities from sample_mymodel) as subquery_alias group by distinct_cities;
"""
cursor.execute(raw_query)
result = [{"city": row[0], "count": row[1]} for row in cursor]
print(result)
References
unnest() - PostgreSQL array function
Django: Executing custom SQL directly
An inefficient way to do it in Python, outside the Django ORM:
import itertools
# data is assumed to be a MyModel queryset
unique_cities = list(data.values_list('cities', flat=True))
unique_cities_compiled = list(itertools.chain.from_iterable(unique_cities))
unique_cities_final = {i: unique_cities_compiled.count(i) for i in unique_cities_compiled}
print(unique_cities_final)
{'london': 2, 'newyork': 1, 'mumbai': 2, 'chennai': 1}
If anyone knows a more efficient way, please post an improved version of this solution.
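For the in-Python variant, collections.Counter does the per-element tally in a single pass rather than calling list.count() once per element. A minimal sketch, using a plain list of lists as a stand-in for the values_list('cities', flat=True) result (the sample rows are the three from the question):

```python
from collections import Counter
from itertools import chain

# Stand-in for list(MyModel.objects.values_list('cities', flat=True)).
rows = [
    ["london", "newyork"],
    ["mumbai"],
    ["london", "chennai", "mumbai"],
]

# `row or []` guards against NULL arrays, since the field allows null=True.
city_counts = Counter(chain.from_iterable(row or [] for row in rows))

print(dict(city_counts))
```

This still pulls every row into Python, so for large tables the unnest() query above is likely the better fit.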

Giving relations an order to sort by

Given the following Django models:
class Room(models.Model):
    name = models.CharField(max_length=20)
class Beacon(models.Model):
    room = models.ForeignKey(Room)
    uuid = models.UUIDField(default=uuid.uuid4)
    major = models.PositiveIntegerField(max_value=65536)
    minor = models.PositiveIntegerField(max_value=65536)
The Beacon model is a bluetooth beacon relationship to the room.
I want to select all Rooms that match a given uuid, major, minor combination.
The catch is, that I want to order the rooms by the beacon that is nearest to me. Because of this, I need to be able to assign a value to each beacon dynamically, and then sort by it.
Is this possible with the Django ORM? In Django 1.8?
NOTE - I will know the ordering of the beacons beforehand, I will be using the order they are passed in the query string. So the first beacon (uuid, major, minor) passed should match the first room that is returned by the Room QuerySet
I am envisioning something like this, though I know this won't work:
beacon_order = [
beacon1 = 1,
beacon0 = 2,
beacon3 = 3,
]
queryset = Room.objects.annotate(beacon_order=beacon_order).\
order_by('beacon_order')
If you already know the order of the beacons, there's no need to sort within the QuerySet itself. Take an ordered list called beacon_list, which contains the beacons' primary keys in order, e.g. the item at index 0 is the closest beacon's primary key, the item at index 1 is the second closest beacon's PK, etc. Then use a list comprehension:
ordered_rooms = [Room.objects.get(pk=x) for x in beacon_list]
You don't have to use the PK either, you can use anything which identifies the given object in the database, e.g. the name field.
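The list comprehension above issues one query per item. If that becomes a concern, the rows can be fetched once and reordered in Python by their position in the known list. A minimal sketch, with plain dicts standing in for the fetched rooms (the beacon keys and room names are made up for illustration):

```python
# Known order, closest beacon first (hypothetical identifiers).
beacon_order = ["b1", "b0", "b3"]

# Stand-in for rooms fetched in a single query, in arbitrary order.
rooms = [
    {"name": "Kitchen", "beacon": "b0"},
    {"name": "Lobby", "beacon": "b3"},
    {"name": "Lab", "beacon": "b1"},
]

# Map each identifier to its rank, then sort by that rank.
position = {key: index for index, key in enumerate(beacon_order)}
ordered_rooms = sorted(rooms, key=lambda room: position[room["beacon"]])

print([room["name"] for room in ordered_rooms])  # ['Lab', 'Kitchen', 'Lobby']
```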
Looks like this works:
from django.db.models import Case, Q, When
beacons = request.query_params.getlist('beacon[]')
query = Q()
order = []
for pos, beacon in enumerate(beacons):
    uuid, major, minor = beacon.split(':')
    query |= Q(
        beacon__uuid=uuid,
        beacon__major=major,
        beacon__minor=minor,
    )
    order.append(When(
        beacon__uuid=uuid,
        beacon__major=major,
        beacon__minor=minor,
        then=pos,
    ))
rooms = Room.objects.filter(query).order_by(Case(*order))

Django get all values Group By particular one field

I want to execute a simple query like:
select *,count('id') from menu_permission group by menu_id
In Django format I have tried:
MenuPermission.objects.all().values('menu_id').annotate(Count('id'))
It selects only menu_id. The executed query is:
SELECT `menu_permission`.`menu_id`, COUNT(`menu_permission`.`id`) AS `id__count` FROM `menu_permission` GROUP BY `menu_permission`.`menu_id`
But I need other fields also. If I try:
MenuPermission.objects.all().values('id','menu_id').annotate(Count('id'))
It adds 'id' in group by condition.
GROUP BY `menu_permission`.`id`
As a result I am not getting the expected result. How can I get all fields in the output but group by a single one?
You can try subqueries to do what you need.
In my case I have two tables: Item and Transaction where item_id links to Item
First, I prepare Transaction subquery with group by item_id where I sum all amount fields and mark item_id as pk for outer query.
per_item_total=Transaction.objects.values('item_id').annotate(total=Sum('amount')).filter(item_id=OuterRef('pk'))
Then I select all rows from Item plus the subquery result as the total field.
items_with_total=Item.objects.annotate(total=Subquery(per_item_total.values('total')))
This produces the following SQL:
SELECT `item`.`id`, {all other item fields},
(SELECT SUM(U0.`amount`) AS `total` FROM `transaction` U0
WHERE U0.`item_id` = `item`.`id` GROUP BY U0.`item_id` ORDER BY NULL) AS `total` FROM `item`
You are trying to achieve this SQL:
select *, count('id') from menu_permission group by menu_id
But normally SQL requires that, when a GROUP BY clause is used, the select list contains only the columns you are grouping by, plus aggregates. This is not a Django matter; it is how SQL GROUP BY works.
The rows are grouped by those columns so those columns can be included in select and other columns can be aggregated if you want them to into a value. You can't include other columns directly as they may have more than one value (since the rows are grouped).
For example if you have a column called "permission_code", you could ask for an array of the values in the "permission_code" column when the rows are grouped by menu_id.
Depending on the SQL flavor you are using, this could be in PostgreSQL something like this:
select menu_id, array_agg(permission_code), count(id) from menu_permission group by menu_id
Similarly, a Django queryset can be constructed for this.
Hopefully this helps, but if needed please share more about what you need to do and what your data models are.
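The "aggregate the other columns" idea can be tried without a PostgreSQL server. The sketch below uses Python's built-in sqlite3 module, with SQLite's group_concat() standing in for array_agg(); the permission_code column and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE menu_permission (
        id INTEGER PRIMARY KEY,
        menu_id INTEGER,
        permission_code TEXT
    );
    INSERT INTO menu_permission (menu_id, permission_code) VALUES
        (1, 'view'), (1, 'edit'), (2, 'view');
""")

# One row per menu_id; the non-grouped column is collapsed into one value.
rows = conn.execute("""
    SELECT menu_id, group_concat(permission_code) AS codes, COUNT(id) AS total
    FROM menu_permission
    GROUP BY menu_id
    ORDER BY menu_id
""").fetchall()

for menu_id, codes, total in rows:
    # The order of values inside group_concat() is not guaranteed.
    print(menu_id, sorted(codes.split(",")), total)
```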
Currently, the only way to make it work as expected is to base your query on the model you want the GROUP BY to be based on.
In your case it looks like you have a Menu model (menu_id field foreign key) so doing this would give you what you want and will allow getting other aggregate information from your MenuPermission model but will only group by the Menu.id field:
Menu.objects.annotate(perm_count=Count('menupermission__id')).values('perm_count')
Of course there is no need for the "annotate" intermediate step if all you want is that single count.
query = MenuPermission.objects.values('menu_id').annotate(menu_id_count=Count('menu_id'))
You can check your SQL query by print(query.query)
This solution doesn't work, all fields end up in the group by clause, leaving it here because it may still be useful to someone.
model_fields = queryset.model._meta.get_fields()
queryset = queryset.values('menu_id') \
    .annotate(
        count=Count('id'),
        **{field.name: F(field.name) for field in model_fields}
    )
What I'm doing is getting the list of fields of our model and building a dictionary with the field name as key and an F instance with the field name as a parameter.
When unpacked (the **) it gets interpreted as named arguments passed into the annotate function.
For example, if we had a "name" field on our model, this annotate call would end up being equal to this:
queryset = queryset.values('menu_id') \
    .annotate(
        count=Count('id'),
        name=F("name")
    )
You can use the following code:
MenuPermission.objects.values('menu_id').annotate(Count('id')).values('field1', 'field2', 'field3'...)

Django query aggregate upvotes in backward relation

I have two models:
Base_Activity:
some fields
User_Activity:
user = models.ForeignKey(settings.AUTH_USER_MODEL)
activity = models.ForeignKey(Base_Activity)
rating = models.IntegerField(default=0) #Will be -1, 0, or 1
Now I want to query Base_Activity, and sort the items that have the most corresponding user activities with rating=1 on top. I want to do something like the query below, but the =1 part is obviously not working.
activities = Base_Activity.objects.all().annotate(
up_votes = Count('user_activity__rating'=1),
).order_by(
'up_votes'
)
How can I solve this?
You cannot use Count like that, as the error message says:
SyntaxError: keyword can't be an expression
The argument of Count must be a simple string, like user_activity__rating.
I think a good alternative can be to use Avg and Count together:
activities = Base_Activity.objects.all().annotate(
a=Avg('user_activity__rating'), c=Count('user_activity__rating')
).order_by(
'-a', '-c'
)
The items with the most rating=1 activities should have the highest average, and among the items with the same average, the ones with the most activities will be listed higher.
If you want to exclude items that have downvotes, make sure to add the appropriate filter or exclude operations after annotate, for example:
activities = Base_Activity.objects.all().annotate(
a=Avg('user_activity__rating'), c=Count('user_activity__rating')
).filter(user_activity__rating__gt=0).order_by(
'-a', '-c'
)
UPDATE
To get all the items, ordered by their upvotes, disregarding downvotes, I think the only way is to use raw queries, like this:
from django.db import connection
sql = '''
SELECT o.id, SUM(v.rating > 0) s
FROM user_activity o
JOIN rating v ON o.id = v.user_activity_id
GROUP BY o.id ORDER BY s DESC
'''
cursor = connection.cursor()
cursor.execute(sql)
rows = cursor.fetchall()
Note: instead of hard-coding the table names of your models, get the table names from the models, for example if your model is called Rating, then you can get its table name with Rating._meta.db_table.
I tested this query on an sqlite3 database, I'm not sure the SUM expression there works in all DBMS. Btw I had a perfect Django site to test, where I also use upvotes and downvotes. I use a very similar model for counting upvotes and downvotes, but I order them by the sum value, stackoverflow style. The site is open-source, if you're interested.
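Since SUM over a boolean expression may not be portable, a CASE expression inside SUM is a more widely supported way to count only the upvotes. The sketch below runs against an in-memory SQLite database using Python's built-in sqlite3 module. The table and column names are guesses modelled on the question's Base_Activity and User_Activity, not the real table names, and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE base_activity (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE user_activity (
        id INTEGER PRIMARY KEY,
        user_id INTEGER,
        activity_id INTEGER,
        rating INTEGER
    );
    INSERT INTO base_activity VALUES
        (1, 'hiking'), (2, 'chess'), (3, 'karaoke'), (4, 'yoga');
    INSERT INTO user_activity (user_id, activity_id, rating) VALUES
        (1, 1, 1), (2, 1, -1), (3, 1, 1),
        (1, 2, 1), (2, 2, 1), (3, 2, 1),
        (1, 3, -1);
""")

# LEFT JOIN keeps activities without any votes; CASE counts only rating = 1.
rows = conn.execute("""
    SELECT a.id, a.title,
           SUM(CASE WHEN ua.rating = 1 THEN 1 ELSE 0 END) AS up_votes
    FROM base_activity a
    LEFT JOIN user_activity ua ON ua.activity_id = a.id
    GROUP BY a.id, a.title
    ORDER BY up_votes DESC, a.id
""").fetchall()

print(rows)
# [(2, 'chess', 3), (1, 'hiking', 2), (3, 'karaoke', 0), (4, 'yoga', 0)]
```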

django annotate count filter

I am trying to count daily records for a model, but I would like the count to include only records where some FK field = xy, so that I get a list of all days on which a new record was created, some of which may have a count of 0.
class SomeModel(models.Model):
    place = models.ForeignKey(Place)
    note = models.TextField()
    time_added = models.DateTimeField()
Say there's a Place with name="NewYork"
data = SomeModel.objects.extra({'created': "date(time_added)"}).values('created').annotate(placed_in_ny_count=Count('id'))
This works, but shows all records.. all places.
Tried with filtering, but it does not return days, where there was no record with place.name="NewYork". That's not what I need.
It looks as though you want to know, for each day on which any object was added, how many of the objects created on that day have a place whose name is New York. (Let me know if I've misunderstood.) In SQL that needs an outer join:
SELECT m.id, date(m.time_added) AS created, count(p.id) AS count
FROM myapp_somemodel AS m
LEFT OUTER JOIN myapp_place AS p
ON m.place_id = p.id
AND p.name = 'New York'
GROUP BY created
So you can always express this in Django using a raw SQL query:
for o in SomeModel.objects.raw('SELECT ...'):  # query as above
    print('On {0}, {1} objects were added in New York'.format(o.created, o.count))
Notes:
I haven't tried to work out if this is expressible in Django's query language; it may be, but as the developers say, the database API is "a shortcut but not necessarily an end-all-be-all."
The m.id is superfluous in the SQL query, but Django requires that "the primary key ... must always be included in a raw query".
You probably don't want to write the literal 'New York' into your query, so pass a parameter instead: raw('SELECT ... AND p.name = %s ...', [placename]).
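To see the outer join and the parameter passing together, here is a minimal sketch against an in-memory SQLite database using Python's built-in sqlite3 module. The table names and rows are made up for illustration, and the m.id column is left out because it is only needed for Django's raw(). Note that sqlite3 uses ? as its placeholder, whereas Django's raw() uses %s.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE place (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE somemodel (
        id INTEGER PRIMARY KEY,
        place_id INTEGER,
        time_added TEXT
    );
    INSERT INTO place VALUES (1, 'NewYork'), (2, 'Boston');
    INSERT INTO somemodel (place_id, time_added) VALUES
        (1, '2012-01-01 09:00:00'),
        (1, '2012-01-01 17:30:00'),
        (2, '2012-01-01 18:00:00'),
        (2, '2012-01-02 08:15:00');
""")

placename = "NewYork"

# The name condition sits in the ON clause, so days without a matching
# place still produce a row, with COUNT(p.id) = 0.
rows = conn.execute("""
    SELECT date(m.time_added) AS created, COUNT(p.id) AS total
    FROM somemodel AS m
    LEFT OUTER JOIN place AS p
        ON m.place_id = p.id AND p.name = ?
    GROUP BY created
    ORDER BY created
""", (placename,)).fetchall()

print(rows)  # [('2012-01-01', 2), ('2012-01-02', 0)]
```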