django orm inner join group by data - django

from django.db import models

class Dishes(models.Model):
    """Dishes."""
    cuisine_list = ((0, 'Sichuan'), (1, 'Cantonese'), (2, 'Anhui'), (3, 'Hunan'))
    name = models.CharField('dish name', max_length=100)
    material = models.TextField('ingredients')
    cuisine = models.IntegerField('cuisine', choices=cuisine_list)
    price = models.IntegerField('price')

    def __str__(self):
        return self.name
How can I select information about the most expensive dish in each cuisine, including the dish name and ingredients?
Dishes.objects.values('cuisine').annotate(max_price=Max("price"))
This only returns the highest price in each cuisine, without the names and ingredients of the dishes. In SQL I could inner join the table against this cuisine/max_price result, but how do I write that with the ORM?
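For reference, the inner join described in the question can be sketched in plain SQL. This is only an illustration run against an in-memory SQLite database with the standard library; the table name `dishes` and the sample rows are assumptions, not the real Django table.

```python
import sqlite3

# Hypothetical table and data mirroring the Dishes model (names are assumptions).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dishes (id INTEGER PRIMARY KEY, name TEXT, material TEXT,
                         cuisine INTEGER, price INTEGER);
    INSERT INTO dishes (name, material, cuisine, price) VALUES
        ('Mapo tofu', 'tofu, beef', 0, 30),
        ('Boiled fish', 'fish, chili', 0, 60),
        ('Dim sum', 'shrimp, flour', 1, 45),
        ('Roast goose', 'goose', 1, 90);
""")

# Join every dish against the per-cuisine maximum price and keep the matches.
rows = conn.execute("""
    SELECT d.cuisine, d.name, d.material, d.price
    FROM dishes AS d
    INNER JOIN (SELECT cuisine, MAX(price) AS max_price
                FROM dishes GROUP BY cuisine) AS m
        ON d.cuisine = m.cuisine AND d.price = m.max_price
    ORDER BY d.cuisine
""").fetchall()
print(rows)
```

Note that if two dishes in one cuisine share the highest price, this join returns both of them.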

If you are using PostgreSQL, you may be able to use the following:
Dishes.objects.order_by(
    'cuisine',  # must come first: DISTINCT ON keeps one row per cuisine
    '-price',   # descending price puts the most expensive dish first
).distinct('cuisine')
See https://docs.djangoproject.com/en/3.0/ref/models/querysets/#distinct for a description. Also see https://www.semicolonworld.com/question/61934/django-group-by-one-field-only-take-the-latest-max-of-each-group-and-get-back-the-orm-objects for further discussion of a similar problem; it is also the source of this suggestion.
A decent explanation of DISTINCT ON is available at: https://www.geekytidbits.com/postgres-distinct-on/

Related

Proper way to annotate a rank field for a queryset

Assume models like this:
class Person(models.Model):
    name = models.CharField(max_length=20)

class Session(models.Model):
    start_time = models.TimeField(auto_now_add=True)
    end_time = models.TimeField(blank=True, null=True)
    person = models.ForeignKey(Person)

class GameSession(models.Model):
    game_type = models.CharField(max_length=2)
    score = models.PositiveIntegerField(default=0, blank=True)
    session = models.ForeignKey(Session)
I want a queryset method that returns each person's total score, which is the sum of all their game scores and all the time they have spent in their sessions, along with the person's rank relative to all persons. Something like this:
class DenseRank(Func):
    function = 'DENSE_RANK'
    template = '%(function)s() Over(Order by %(expressions)s desc)'

class PersonQuerySet(models.query.QuerySet):
    def total_scores(self):
        return self.annotate(total_score=some_fcn_for_calculate).annotate(rank=DenseRank('total_score'))
I found a way to calculate the total score, but this dense rank is not what I want: it only ranks the persons in the current queryset, whereas I want each person's rank relative to all persons.
I use Django 1.11 and PostgreSQL 10.5. Please suggest a proper way to find the rank of each person in a queryset, because I want to be able to add other filters before or after calculating total_score and rank.
Sadly, this is not possible in a single query because, as far as I can tell, the PostgreSQL WHERE clause (filter/exclude) narrows the rows before window functions such as DENSE_RANK are evaluated.
The only solution I found is to compute the ranking for all Person rows with a separate queryset and then annotate your queryset with the results.
This answer (see the improved method) explains how to "annotate a queryset with externally prepared data in a dict".
Here is the implementation I made for your models:
from django.db import models
from django.db.models.functions import DenseRank  # built-in window function (Django 2.0+)

class PersonQuerySet(models.QuerySet):
    def total_scores(self):
        # compute the global ranking over all persons
        ranks = (Person.objects
                 .annotate(total_score=models.Sum('session__gamesession__score'))
                 .annotate(rank=models.Window(expression=DenseRank(),
                                              order_by=models.F('total_score').desc()))
                 .values('pk', 'rank'))
        # extract and put ranks in a dict
        rank_dict = {e['pk']: e['rank'] for e in ranks}
        # create `WHEN` conditions for mapping filtered Persons to their Rank
        whens = [models.When(pk=pk, then=rank) for pk, rank in rank_dict.items()]
        # build the query
        return (self.annotate(rank=models.Case(*whens, default=0,
                                               output_field=models.IntegerField()))
                .annotate(total_score=models.Sum('session__gamesession__score')))
I tested it with Django 2.1.3 and PostgreSQL 10.5, so the code may differ slightly for you.
Feel free to share a version compatible with Django 1.11!
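The two-step idea (rank everybody first, then look the ranks up for whatever subset survives the filter) can be illustrated without a database. This is a plain-Python sketch with made-up names and scores, not the ORM code itself; `dense_rank` is a hypothetical helper.

```python
def dense_rank(scores):
    """Map each key to its dense rank (1 = highest score) among the given entries."""
    distinct_desc = sorted(set(scores.values()), reverse=True)
    rank_of_score = {score: i for i, score in enumerate(distinct_desc, start=1)}
    return {key: rank_of_score[score] for key, score in scores.items()}

# Hypothetical total scores for all persons.
total_scores = {"ann": 50, "bob": 70, "cid": 50, "dee": 20}

# Step 1: rank everybody, before any filtering.
global_ranks = dense_rank(total_scores)

# Step 2: filter, then attach the precomputed global rank.
filtered = [name for name in total_scores if name.startswith(("a", "d"))]
result = {name: global_ranks[name] for name in filtered}

# Ranking only the filtered subset gives different (local) ranks.
local_ranks = dense_rank({name: total_scores[name] for name in filtered})

print(result)       # {'ann': 2, 'dee': 3}
print(local_ranks)  # {'ann': 1, 'dee': 2}
```

The difference between `result` and `local_ranks` is the same difference the question describes between ranking all persons and ranking only the current queryset.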

Bin a queryset using Django?

Let's say we have the following simplistic models:
class Category(models.Model):
    name = models.CharField(max_length=264)

    def __str__(self):
        return self.name

    class Meta:
        verbose_name_plural = "categories"

class Status(models.Model):
    name = models.CharField(max_length=264)

    def __str__(self):
        return self.name

    class Meta:
        verbose_name_plural = "status"

class Product(models.Model):
    title = models.CharField(max_length=264)
    description = models.CharField(max_length=264)
    category = models.ForeignKey(Category, on_delete=models.CASCADE)
    # DecimalField also requires decimal_places; 2 is assumed here
    price = models.DecimalField(max_digits=10, decimal_places=2)
    status = models.ForeignKey(Status, on_delete=models.CASCADE)
My aim is to get statistics such as total products, total sales and average sales, grouped by the price bin each product belongs to.
The price bins could be something like 0-100, 100-500, 500-1000, etc.
I know how to do something like this with pandas:
Binning column with python pandas
I am looking for a way to do this with the Django ORM.
One thought is to convert the queryset into a list, apply a function to get the appropriate price bin, and then compute the statistics.
Another thought, which I am not sure how to implement, is the same as above but applying the bin function only to the queryset field I am interested in.
There are three pathways I can see.
The first is to write the SQL yourself and run it through the model's manager: .objects.raw("[sql goes here]"). This answer shows how to define groups with a simple function of the column value; something like that could work:
SELECT FLOOR(grade/5.00)*5 As Grade,
COUNT(*) AS [Grade Count]
FROM TableName
GROUP BY FLOOR(Grade/5.00)*5
ORDER BY 1
Second, there is no reason you can't move the queryset (with .values() or .values_list()) into a pandas DataFrame or similar and bin it there, as you mentioned. Getting the queryset into a DataFrame and then processing it probably costs some efficiency, but I am not sure it would always be a problem. If it's easier to write and maintain, that might be fine.
The third way I would try (which I think is what you really want) is chaining .annotate() to label points with the bin they belong in, and the aggregate count function to count how many are in each bin. This is more advanced ORM work than I've done, but I think you'd start looking at something like the docs section on conditional aggregation. I've adapted this slightly to create the 'price_class' column first, with annotate.
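from django.db.models import Count, F, Q
from django.db.models.functions import Floor  # available since Django 2.2

Product.objects.annotate(price_class=Floor(F('price') / 100)).aggregate(
    class_zero=Count('pk', filter=Q(price_class=0)),
    class_one=Count('pk', filter=Q(price_class=1)),
    class_two=Count('pk', filter=Q(price_class=2)),  # etc etc
)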
I'm not sure whether that floor call will work as written, and you may need ExpressionWrapper to force price_class into the right output_field type. All the best.
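To see what fixed-width binning does at the SQL level, here is a small sketch against an in-memory SQLite database. The `product` table, the integer prices and the bin width of 100 are assumptions for illustration; integer division stands in for FLOOR, and uneven bins such as 100-500 would need a CASE expression instead.

```python
import sqlite3

# Hypothetical product table with integer prices (names and data are assumptions).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, title TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO product (title, price) VALUES (?, ?)",
    [("a", 20), ("b", 80), ("c", 150), ("d", 199), ("e", 420)],
)

# Integer division maps each price to the start of its 100-wide bin.
bins = conn.execute("""
    SELECT (price / 100) * 100 AS bin_start,
           COUNT(*) AS n_products,
           SUM(price) AS total,
           AVG(price) AS average
    FROM product
    GROUP BY price / 100
    ORDER BY bin_start
""").fetchall()

for bin_start, n_products, total, average in bins:
    print(bin_start, n_products, total, average)
```

Each output row is one bin: its lower bound, the number of products in it, and the sum and average of their prices.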

How to query database with conditional expression in Django?

I have three models: Business, Offers and OfferPlan:
Business:
class Business(models.Model):
    name_of_business = models.CharField(max_length=255)
Offers:
class Offers(models.Model):
    business = models.ForeignKey(Business, related_name="business_offer",
                                 on_delete=models.CASCADE)
    title = models.CharField(max_length=255)
    subtext = models.CharField(max_length=255)
OfferPlan:
from django.utils.translation import gettext_lazy as _

class OfferPlan(models.Model):
    WEEKDAYS = [
        (1, _("Monday")),
        (2, _("Tuesday")),
        (3, _("Wednesday")),
        (4, _("Thursday")),
        (5, _("Friday")),
        (6, _("Saturday")),
        (7, _("Sunday")),
    ]
    offer = models.ForeignKey(Offers, related_name="business_offer_plan",
                              on_delete=models.CASCADE)
    weekday = models.IntegerField(
        choices=WEEKDAYS,
    )
    from_hour = models.TimeField()
    to_hour = models.TimeField()
I have a ListView that searches for open businesses based on parameters such as city and category. I now also want to search by weekday: a business that is open on Monday should be displayed, and one that is not should be hidden for that day. Weekday information is stored in OfferPlan, and there can be multiple time slots for an offer on the same day, but I want to filter (or exclude) businesses that have at least one entry for that weekday number.
Here is my ListView:
class SearchListView(ListView):
    template_name = 'search/search.html'
    model = Business

    def get_queryset(self):
        # queryset = Business.objects.filter(business_address__city=AppLocations.objects.first().city)
        queryset = Business.objects.all()  # start unfiltered so `queryset` is always defined
        if 'city' in self.request.GET:
            queryset = queryset.filter(business_address__city=self.request.GET.get('city'))
        if 'category' in self.request.GET:
            queryset = queryset.filter(category__code=self.request.GET.get('category'))
        # if 'date' not in self.request.GET:
        #     queryset = B
        return queryset
How could this be done? I also looked into https://docs.djangoproject.com/en/1.8/ref/models/conditional-expressions/ but could not figure it out.
Thanks
Update 1
After more research on the web, I figured out that this is how it could be achieved, but I would like other Django enthusiasts here to confirm that it is right.
queryset.filter(business_offer__business_offer_plan__weekday=1).annotate(count_entry=Count('business_offer__business_offer_plan__weekday')).filter(count_entry__gt=1)
Solution
Jefferson's solution was accepted as the right answer because it gave more insight into which query is faster and what was wrong with my previous update. Here is the solution we both agreed on:
queryset.filter(business_offer__business_offer_plan__weekday=1).annotate(count_entry=Count('business_offer__business_offer_plan__weekday')).filter(count_entry__gte=1)
def get_query(weekday):
    businesses = Business.objects.filter(business_offer__in=Offers.objects.filter(
        business_offer_plan__in=OfferPlan.objects.filter(weekday=weekday))).distinct()
    return businesses
It's a heavy query, but it works.
There's no conditional expression here - and your annotation is much too complicated. You just need an additional filter.
queryset.filter(business_offer__business_offer_plan__weekday=self.request.GET['weekday'])
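One thing worth keeping in mind with a filter that spans a reverse foreign key: it becomes a join, so a business with several plans on the same weekday comes back once per matching plan unless the query is made distinct. The sketch below shows that effect in plain SQL on an in-memory SQLite database; the table names and rows are assumptions for illustration only.

```python
import sqlite3

# Hypothetical tables standing in for Business, Offers and OfferPlan.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE business (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE offers (id INTEGER PRIMARY KEY, business_id INTEGER);
    CREATE TABLE offerplan (id INTEGER PRIMARY KEY, offer_id INTEGER, weekday INTEGER);
    INSERT INTO business VALUES (1, 'Cafe'), (2, 'Bakery');
    INSERT INTO offers VALUES (1, 1), (2, 2);
    -- Cafe has two Monday slots, Bakery only a Tuesday slot
    INSERT INTO offerplan VALUES (1, 1, 1), (2, 1, 1), (3, 2, 2);
""")

join_sql = """
    SELECT {distinct} b.name
    FROM business b
    JOIN offers o ON o.business_id = b.id
    JOIN offerplan p ON p.offer_id = o.id
    WHERE p.weekday = ?
"""

# Without DISTINCT the Cafe appears once per matching Monday slot.
plain = conn.execute(join_sql.format(distinct=""), (1,)).fetchall()
# With DISTINCT each business appears once.
deduped = conn.execute(join_sql.format(distinct="DISTINCT"), (1,)).fetchall()

print(plain)
print(deduped)
```

In the ORM, appending .distinct() to the filtered queryset plays the role of SELECT DISTINCT here.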

Django advanced join / query, how to filter foreign keys?

I have these two models.
class City(models.Model):
    city = models.CharField(max_length=200)
    country = models.CharField(max_length=200)

class CityTranslation(models.Model):
    city = models.ForeignKey(City)
    name = models.CharField(max_length=200)
    lang = models.CharField(max_length=2)
    prio = models.IntegerField()
Every city can have multiple translated names within one language.
I want to get all cities with country="Poland". If a city has one or more CityTranslations with lang=name, I want to get only the first one, ordered by prio.
I am doing something like this now:
City.objects.filter(country="Poland", citytranslation__lang="pl").annotate(transname=F("citytranslation__name"))
But this is not working, because:
If there is a City without a CityTranslation, it won't be listed.
If there are multiple CityTranslations, they will all be shown, but I just want the first (... .order_by('prio').first()).
Any idea?
EDIT:
Solved it by using a @property that orders my CityTranslation rows by prio and picks the first one:
@property
def transcity(self):
    return self.citytranslation_set.filter(lang="pl").order_by('-prio').first()
def magic(passed_country="Poland", passed_lang="pl"):
    # I want to get all cities with country="Poland".
    cities = City.objects.filter(country=passed_country)
    # If a city has one or more CityTranslations in the requested language, take the first one ordered by prio.
    suitable_cities = cities.filter(citytranslation__lang=passed_lang)
    if suitable_cities.exists():
        first_matching_city = suitable_cities.order_by('citytranslation__prio').first()
    else:
        first_matching_city = cities.order_by('citytranslation__prio').first()
    return first_matching_city
You may need to set a related_name on CityTranslation.city.
You may not need order_by if you plan on ordering by ID anyway.
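For comparison, the result the question asks for (every city in the country, each with its first translation by prio, or nothing if it has none) can be written as a correlated subquery in SQL. The sketch below runs it on an in-memory SQLite database; the table names and sample rows are assumptions. It is roughly the shape of query a subquery annotation in the ORM would aim for.

```python
import sqlite3

# Hypothetical tables standing in for City and CityTranslation.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE city (id INTEGER PRIMARY KEY, city TEXT, country TEXT);
    CREATE TABLE citytranslation (id INTEGER PRIMARY KEY, city_id INTEGER,
                                  name TEXT, lang TEXT, prio INTEGER);
    INSERT INTO city VALUES (1, 'Warsaw', 'Poland'), (2, 'Krakow', 'Poland'),
                            (3, 'Berlin', 'Germany');
    INSERT INTO citytranslation (city_id, name, lang, prio) VALUES
        (1, 'Stolica', 'pl', 2),
        (1, 'Warszawa', 'pl', 1),
        (3, 'Berlin', 'pl', 1);
""")

# For each city in the country, pick the lowest-prio translation in the language.
# Cities without a translation stay in the result with NULL (None in Python).
rows = conn.execute("""
    SELECT c.city,
           (SELECT t.name
            FROM citytranslation t
            WHERE t.city_id = c.id AND t.lang = ?
            ORDER BY t.prio
            LIMIT 1) AS transname
    FROM city c
    WHERE c.country = ?
    ORDER BY c.city
""", ("pl", "Poland")).fetchall()
print(rows)
```

Krakow has no translation and is still listed, and Warsaw gets only the translation with the lowest prio value.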

Django query - Is it possible to group elements by common field at database level?

I have a database model as shown below. Consider the data as 2 different books each having 3 ratings.
class Book(models.Model):
    name = models.CharField(max_length=50)

class Review(models.Model):
    book = models.ForeignKey(Book)
    review = models.CharField(max_length=1000)
    rating = models.IntegerField()
Question: Is it possible to group all the ratings for each book into a list with a single query? I'm looking to do this at the database level, without iterating over the queryset in my code. The output should look something like:
[
    {
        'book__name': 'book1',
        'rating': [3, 4, 4],
        'average': 3.66,
    },
    {
        'book__name': 'book2',
        'rating': [2, 1, 1],
        'average': 1.33,
    },
]
I've tried this query, but the ratings are not grouped by book name and the average is not correct:
Review.objects.annotate(average=Avg('rating')).values('book__name','rating','average')
Edit : Added clarification that I'm looking for a method to group the elements at database level.
You can do this. Hope this helps.
Review.objects.values('book__name').annotate(average=Avg('rating'))
UPDATE:
If you want all the ratings of a particular book in a list, then you can do this:
from collections import defaultdict
ratings = defaultdict(list)
for result in Review.objects.values('book__name', 'rating').order_by('book__name', 'rating'):
    ratings[result['book__name']].append(result['rating'])
You will get a structure like this:
{ book__name: [rating1, rating2, ...], ... }
UPDATE:
q = Book.objects.annotate(average=Avg('rating__rating')).prefetch_related('rating')
q[0].rating.all()  # all the reviews (and so ratings) of a particular book
q[0].average  # average of all the ratings of that book
I hope this works (I'm not sure, sorry), but you need to add a related_name attribute:
class Review(models.Model):
    book = models.ForeignKey(Book, related_name='rating')
UPDATE:
Sorry to say, but what you need is GROUP_CONCAT in SQL, which the Django ORM does not currently support.
You can use raw SQL or itertools:
from django.db import connection
sql = """
SELECT name, avg(rating) AS average, GROUP_CONCAT(rating) AS rating
FROM book JOIN review on book.id = review.book_id
GROUP BY name
"""
cursor = connection.cursor()
cursor.execute(sql)
data = cursor.fetchall()
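The same GROUP_CONCAT query can be tried outside Django. This is a minimal sketch against an in-memory SQLite database, using the sample ratings from the question; the table names `book` and `review` are assumed to match the raw SQL above, and the ratings are sorted in Python because the order inside GROUP_CONCAT is not guaranteed.

```python
import sqlite3

# Hypothetical tables mirroring Book and Review, filled with the question's data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE book (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE review (id INTEGER PRIMARY KEY, book_id INTEGER, rating INTEGER);
    INSERT INTO book VALUES (1, 'book1'), (2, 'book2');
    INSERT INTO review (book_id, rating) VALUES
        (1, 3), (1, 4), (1, 4), (2, 2), (2, 1), (2, 1);
""")

rows = conn.execute("""
    SELECT name, AVG(rating) AS average, GROUP_CONCAT(rating) AS ratings
    FROM book JOIN review ON book.id = review.book_id
    GROUP BY name
    ORDER BY name
""").fetchall()

# GROUP_CONCAT returns a comma-separated string; split it back into integers.
result = [
    {
        "book__name": name,
        "rating": sorted(int(r) for r in ratings.split(",")),
        "average": round(average, 2),
    }
    for name, average, ratings in rows
]
print(result)
```

The grouping and averaging happen in the database; only the string-to-list conversion is left to Python.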