Let's say I'm using Django to manage a database about athletes:
class Player(models.Model):
    name = models.CharField()
    weight = models.DecimalField()
    team = models.ForeignKey('Team')

class Team(models.Model):
    name = models.CharField()
    sport = models.ForeignKey('Sport')

class Sport(models.Model):
    name = models.CharField()
Let's say I wanted to compute the average weight of the players on each team. I think I'd do:
Team.objects.annotate(avg_weight=Avg('player__weight'))
But now say that I want to compute the variance of team weights within each sport. Is there a way to do that using the Django ORM? How about using the extra() method on a QuerySet? Any advice is much appreciated.
You can use a query like this:
from django.db import models
from django.db.models import Subquery

class SumSubquery(Subquery):
    # Sums the single annotated column produced by the wrapped subquery.
    template = "(SELECT SUM(`%(field)s`) FROM (%(subquery)s _sum))"
    output_field = models.FloatField()

    def as_sql(self, compiler, connection, template=None, **extra_context):
        connection.ops.check_expression_support(self)
        template_params = {**self.extra, **extra_context}
        template_params['subquery'], sql_params = self.queryset.query.get_compiler(connection=connection).as_sql()
        # The column to sum is the first (and only) annotation selected by the subquery.
        template_params["field"] = list(self.queryset.query.annotation_select_mask)[0]
        sql = template % template_params
        return sql, sql_params
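from django.db.models import Avg, Count, ExpressionWrapper, F, OuterRef

# Squared deviation of each player's weight, restricted to the outer row's sport.
squared_deviations = (
    Player.objects.filter(team__sport_id=OuterRef("sport_id"))
    .annotate(sum_pow=ExpressionWrapper(
        (Avg("team__players__weight") - F("weight")) ** 2,
        output_field=models.FloatField(),
    ))
    .values("sum_pow")
)
Team.objects.values("sport__name").annotate(
    variance=SumSubquery(squared_deviations)
    / (Count("players", output_field=models.FloatField()) - 1)
)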
This requires adding a related_name to the Player model, like this:
class Player(models.Model):
    name = models.CharField()
    weight = models.DecimalField()
    team = models.ForeignKey('Team', related_name="players")
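To sanity-check whatever the ORM returns, it can help to have a plain-Python reference for the quantity the question describes: the sample variance of the per-team average weights, grouped by sport. The sketch below is only that reference, not a translation of the query above. The rows and the function name team_weight_variance_by_sport are made up for illustration and stand in for data read from the Player, Team and Sport tables.

```python
from collections import defaultdict
from statistics import mean, variance

# Hypothetical (sport, team, player weight) rows; not taken from the question.
rows = [
    ("soccer", "A", 70.0), ("soccer", "A", 80.0),
    ("soccer", "B", 60.0), ("soccer", "B", 70.0),
    ("rugby", "C", 100.0), ("rugby", "C", 110.0),
    ("rugby", "D", 90.0), ("rugby", "D", 100.0),
]


def team_weight_variance_by_sport(rows):
    """Return {sport: sample variance of the per-team mean weights}."""
    # Step 1: collect player weights per (sport, team).
    weights = defaultdict(list)
    for sport, team, weight in rows:
        weights[(sport, team)].append(weight)

    # Step 2: reduce each team to its mean weight, grouped by sport.
    team_means = defaultdict(list)
    for (sport, _team), team_weights in weights.items():
        team_means[sport].append(mean(team_weights))

    # Step 3: sample variance needs at least two teams per sport.
    return {
        sport: variance(means)
        for sport, means in team_means.items()
        if len(means) > 1
    }


result = team_weight_variance_by_sport(rows)
```

Sports with a single team are left out because the sample variance (dividing by n - 1) is undefined for one value.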
I'm going to assume (perhaps incorrectly) that you mean by 'variance' the difference between maximum and minimum weights. If so, you can generate more than one aggregate with a single query, like so:
from django.db.models import Avg, Max, Min
Team.objects.aggregate(Avg('player__weight'), Max('player__weight'), Min('player__weight'))
This is taken from the Django documentation on generating aggregates over a queryset.
Related
I'm trying to sort (order) by statistical data stored in a ManyToOne relationship. Suppose I have the following code:
class Product(models.Model):
    info = ...
    data = models.IntegerField(default=0.0)

class Customer(models.Model):
    info = ...
    purchases = models.ManyToManyField(Product, related_name='customers', blank=True)

class ProductStats(models.Model):
    ALL = 0
    YOUNG = 1
    OLD = 2
    TYPE = ((ALL, 'All'), (YOUNG, 'Young'), (OLD, 'Old'),)
    stats_type = models.SmallIntegerField(choices=TYPE)
    product = models.ForeignKey(Product, related_name='stats', on_delete=models.CASCADE)
    data = models.FloatField(default=0.0)
Then I would like to sort the products by their stats for the ALL demographic (assume every product has a stats connected to it for ALL). This might look something like the following:
products = Product.objects.all().order_by('stats__data for stats__stats_type=0')
Currently the only solutions I can think of are either to create a new stats class just for ALL and link it to Product with a OneToOneField, or to add a OneToOneField on Product that points to its ALL entry in ProductStats.
Thank you for your help.
How about using multiple fields in order_by, like this:
Product.objects.all().order_by('stats__stats_type', 'stats__data')
# orders by stats type first (0, then 1, then 2), then by data within each type
Or if you want to get data for only stats_type 0:
Product.objects.filter(stats__stats_type=0).order_by('stats__data')
You can annotate the value of the relevant demographic and order by that:
from django.db.models import F
Product.objects.filter(stats__stats_type=0).annotate(data_for_all=F('stats__data')).order_by('data_for_all')
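As a rough illustration of what filtering on the relation and then ordering by the annotated value amounts to, here is a sqlite3 sketch of the same join, filter and sort. It assumes the annotation reads from the same joined row that the filter matched. The table names, column names and sample rows are hypothetical, and this is not the SQL that Django itself generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE productstats (
        id INTEGER PRIMARY KEY,
        product_id INTEGER REFERENCES product(id),
        stats_type INTEGER,
        data REAL
    );
    INSERT INTO product VALUES (1, 'apple'), (2, 'bread'), (3, 'cheese');
    INSERT INTO productstats (product_id, stats_type, data) VALUES
        (1, 0, 2.5), (1, 1, 9.0),
        (2, 0, 1.0), (2, 2, 8.0),
        (3, 0, 4.0);
""")


def products_by_all_stats(conn):
    """Return (name, data) pairs sorted by the stats row with stats_type = 0."""
    cursor = conn.execute("""
        SELECT p.name, s.data AS data_for_all
        FROM product AS p
        JOIN productstats AS s ON s.product_id = p.id
        WHERE s.stats_type = 0        -- keep only the ALL demographic
        ORDER BY data_for_all
    """)
    return cursor.fetchall()


ordered = products_by_all_stats(conn)
```

Because the filter keeps exactly one stats row per product (given that every product has an ALL entry), each product appears once in the result.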
class Nutrient(models.Model):
    tagname = models.CharField(max_length=10)

class FoodNutrientAmount(models.Model):
    nutrient = models.ForeignKey(Nutrient)
    # 'Food' is referenced by name because the class is defined further down.
    food = models.ForeignKey('Food')
    amount = models.FloatField()

class Food(models.Model):
    nutrients = models.ManyToManyField(
        Nutrient,
        through=FoodNutrientAmount,
    )
So, I can get the Foods ordered by the amount of tagname=FOL Nutrient with a list comprehension:
ordered_fnas = FoodNutrientAmount.objects.filter(
    nutrient__tagname="FOL"
).order_by('-amount')
ordered_foods_by_most_fol = [fna.food for fna in ordered_fnas]
Can I get such an iterable as a queryset without taking the whole thing into memory?
Maybe there is a different approach using Food.objects.annotate or extra? I can't think of a great way to do it at the moment.
I can get close with values_list, but then I get an ordered list of pks rather than the queryset of Food objects that I want.
FoodNutrientAmount.objects.filter(
    nutrient__tagname='FOL'
).order_by('-amount').values_list('food', flat=True)
Edit:
This is a many-to-many relationship, so you can probably leverage that. How about adding a default ordering to FoodNutrientAmount? Then you can just run normal many-to-many queries.
class FoodNutrientAmount(models.Model):
    nutrient = models.ForeignKey(Nutrient)
    food = models.ForeignKey(Food)
    amount = models.FloatField()

    class Meta:
        ordering = ('-amount',)
Then you can just call:
nutritious_foods = Food.objects.filter(nutrients__tagname='FOL').order_by('foodnutrientamount')
I currently have two different models.
class Journal(models.Model):
    date = models.DateField()
    from_account = models.ForeignKey(Account, related_name='transferred_from')
    to_account = models.ForeignKey(Account, related_name='transferred_to')
    amount = models.DecimalField(max_digits=8, decimal_places=2)
    memo = models.CharField(max_length=100, null=True, blank=True)

class Ledger(models.Model):
    date = models.DateField()
    bank_account = models.ForeignKey(EquityAccount, related_name='paid_from')
    account = models.ForeignKey(Account)
    amount = models.DecimalField(max_digits=8, decimal_places=2)
    name = models.ForeignKey(Party)
    memo = models.CharField(max_length=100, null=True, blank=True)
I am creating a report in a view and get the following error:
Merging 'ValuesQuerySet' classes must involve the same values in each case.
What I'm trying to do is only pull out the fields that are common so I can concatenate both of them e.g.
def report(request):
    ledger = GeneralLedger.objects.values('account').annotate(total=Sum('amount'))
    journal = Journal.objects.values('from_account').annotate(total=Sum('amount'))
    report = ledger & journal
    ...
If I try to make them exactly the same to test e.g.
def report(request):
    ledger = GeneralLedger.objects.values('memo').annotate(total=Sum('amount'))
    journal = Journal.objects.values('memo').annotate(total=Sum('amount'))
    report = ledger & journal
    ...
I get this error:
Cannot combine queries on two different base models.
Anyone know how this can be accomplished?
from itertools import chain
report = chain(ledger, journal)
Itertools for the win!
If you want to do a union, you should convert these querysets into Python set objects.
If it is possible to filter the queryset itself so that it returns the right rows, you should really do that instead!
Use itertools.chain:
from itertools import chain
report = list(chain(ledger, journal))
Note: chain returns a one-shot iterator, so convert it to a list if the result has to be iterated more than once, indexed, or measured with len() (for example in a template or a paginator).
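A small stand-alone sketch of the chain approach, using plain lists of dicts in place of the two values(...).annotate(...) querysets. The sample rows and the helper totals_by_account are hypothetical. The sketch shows that the chained iterator is exhausted after one pass, and one possible way to merge the per-account totals afterwards.

```python
from collections import defaultdict
from decimal import Decimal
from itertools import chain

# Hypothetical stand-ins for the rows each queryset would yield.
ledger = [
    {"account": 1, "total": Decimal("10.00")},
    {"account": 2, "total": Decimal("5.50")},
]
journal = [
    {"from_account": 1, "total": Decimal("2.25")},
]

combined = chain(ledger, journal)   # lazy iterator over both sources
report = list(combined)             # materialise it once
leftover = list(combined)           # the iterator is now exhausted


def totals_by_account(rows):
    """Sum 'total' per account, accepting either key name for the account."""
    totals = defaultdict(Decimal)   # Decimal() starts each sum at 0
    for row in rows:
        account = row.get("account", row.get("from_account"))
        totals[account] += row["total"]
    return dict(totals)


totals = totals_by_account(report)
```

Merging in Python like this is only reasonable when both result sets are small, which is usually the case once the rows are already grouped and summed.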
I had the same issue. I solved it using the union method (available since Django 1.11):
combined_queryset = qs1.union(qs2)
Using your example:
report = ledger.union(journal)
I have a query like this:
em = Employer.objects.filter(id=1).annotate(overall_value=Sum('companyreview__overallRating'))
em[0].overall_value
As you can see, I want the sum of the overallRating field of all companyreview objects whose employer has id = 1.
The query above does what I want, but I am sure there is a way to get the sum from an Employer instance.
How can I write this query along the lines of
em =Employer.objects.get(id=1)
rate = em.companyreview_set.all().annotate(overall_value = Sum('overallRating'))
rate.overall_value
?
Thanks
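Use aggregate, which returns a dictionary (here with the key 'overall_value') rather than a queryset:
em.companyreview_set.aggregate(overall_value=Sum('overall_rating'))
This assumes the following models: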
class Employer(models.Model):
    name = models.CharField(max_length=100)

class CompanyReview(models.Model):
    employer = models.ForeignKey(Employer)
    overall_rating = models.IntegerField()
Short description: given a queryset myQueryset, how do I select max("myfield") without actually retrieving all rows and doing max in python?
The best I can think of is max([r["myfield"] for r in myQueryset.values("myfield")]), which isn't very good if there are millions of rows.
Long description: Say I have two models in my Django app, City and Country. City has a foreign key field to Country:
class Country(models.Model):
    name = models.CharField(max_length=256)

class City(models.Model):
    name = models.CharField(max_length=256)
    population = models.IntegerField()
    country = models.ForeignKey(Country, related_name='cities')
This means that a Country instance has .cities available. Let's say I now want to write a method for Country called highest_city_population that returns the population of the largest city. Coming from a LINQ background, my natural instinct is to try myCountry.cities.max('population') or something like that, but this isn't possible.
Use Aggregation (new in Django 1.1). You use it like this:
>>> from django.db.models import Max
>>> City.objects.all().aggregate(Max('population'))
{'population__max': 28025000}
To get the highest population of a City for each Country, I think you could do something like this:
>>> from django.db.models import Max
>>> Country.objects.annotate(highest_city_population=Max('cities__population'))
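To make the point about not pulling every row into Python concrete, here is a sqlite3 sketch of the two kinds of query involved: a single MAX over the whole table, and a MAX per country via GROUP BY. In both cases the database computes the maximum and returns only the result rows. The table layout, the country and city names, and the populations are hypothetical, and the SQL is an approximation rather than what Django emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE country (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE city (
        id INTEGER PRIMARY KEY,
        name TEXT,
        population INTEGER,
        country_id INTEGER REFERENCES country(id)
    );
    INSERT INTO country VALUES (1, 'Freedonia'), (2, 'Sylvania');
    INSERT INTO city (name, population, country_id) VALUES
        ('Alpha', 500, 1), ('Beta', 1200, 1), ('Gamma', 300, 2);
""")

# Counterpart of an aggregate over all cities: one row holding one value.
(overall_max,) = conn.execute("SELECT MAX(population) FROM city").fetchone()

# Counterpart of annotating each country: one row per country.
per_country = dict(conn.execute("""
    SELECT country.name, MAX(city.population)
    FROM country
    LEFT JOIN city ON city.country_id = country.id
    GROUP BY country.id, country.name
"""))
```

The LEFT JOIN keeps countries that have no cities, which then get NULL (None in Python) as their maximum.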