Error sum in sql request - django

I have a model:
class Sales(ModelWithCreator, models.Model):
    date = models.DateField(default=datetime.date.today)
    merchandise = models.ForeignKey(Merchandise)
    partner = models.ForeignKey(Partner)
    count = models.PositiveIntegerField()
    debt = models.PositiveIntegerField(default=0)
    price = models.PositiveIntegerField()
    cost = models.PositiveIntegerField()
I need to get the sum over all objects of that model.
I tried this:
Sales.objects.all().extra(select={
    'total': 'Sum(price * count - cost)'
})
But I got this error:
column "sales_sales.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT (Sum(price * count - cost)) AS "total", "sales_sales"...

I think you want the .aggregate() function; see the Django documentation on aggregation.
Something like this (though I haven't tested it):
from django.db.models import F, Sum
Sales.objects.aggregate(total=Sum(F('price') * F('count') - F('cost')))
The same documentation also suggests inspecting a queryset's .query attribute (for example, print(queryset.query)) to see the SQL that is generated. That might help you see what is going wrong.
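For what it's worth, the original error comes from extra(select=...) placing the Sum next to every regular column of the model in the SELECT list, which PostgreSQL rejects without a GROUP BY. aggregate() asks for a single summary row instead. Below is a minimal sketch of roughly the SQL one would expect it to issue, run against an in-memory SQLite table rather than the real database; the table name is taken from the error message and the rows are made up.

```python
import sqlite3

# Hypothetical stand-in for the sales_sales table: only the columns used in the sum.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE sales_sales ('
    'id INTEGER PRIMARY KEY, price INTEGER, "count" INTEGER, cost INTEGER)'
)
conn.executemany(
    'INSERT INTO sales_sales (price, "count", cost) VALUES (?, ?, ?)',
    [(10, 3, 5), (20, 1, 8), (7, 2, 4)],  # made-up rows
)

# A single aggregate over the whole table. No per-row columns appear in the
# SELECT list, so no GROUP BY is needed.
(total,) = conn.execute(
    'SELECT SUM(price * "count" - cost) FROM sales_sales'
).fetchone()
print(total)
```

In the ORM version the same number would come back as a dictionary entry, i.e. result['total'].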

Try this:
Sales.objects.all().extra(select={
    'total': 'Sum(price * count - cost)'
}).values("total")

Related

Django annotation on compoundish primary key with filter ignoring primary key, resulting in too many annotated items

Please see EDIT1 below, as well.
Using Django 3.0.6 and Python 3.8, given the following models:
class Plants(models.Model):
    plantid = models.TextField(primary_key=True, unique=True)
class Pollutions(models.Model):
    pollutionsid = models.IntegerField(unique=True, primary_key=True)
    year = models.IntegerField()
    plantid = models.ForeignKey(Plants, models.DO_NOTHING, db_column='plantid')
    pollutant = models.TextField()
    releasesto = models.TextField(blank=True, null=True)
    amount = models.FloatField(db_column="amount", blank=True, null=True)
    class Meta:
        managed = False
        db_table = 'pollutions'
        unique_together = (('plantid', 'releasesto', 'pollutant', 'year'))
class Monthp(models.Model):
    monthpid = models.IntegerField(unique=True, primary_key=True)
    year = models.IntegerField()
    month = models.IntegerField()
    plantid = models.ForeignKey(Plants, models.DO_NOTHING, db_column='plantid')
    power = models.IntegerField(null=False)
    class Meta:
        managed = False
        db_table = 'monthp'
        unique_together = ('plantid', 'year', 'month')
I'd like to annotate each plant, based on a foreign key relationship and a filter, with a value: specifically, its amount of CO2 and the Sum of its power for a given year. For the sake of debugging I replaced Sum with Count in the following query:
annotated = tmp.all().annotate(
    energy=Count('monthp__power', filter=Q(monthp__year=YEAR)),
    co2=Count('pollutions__amount', filter=Q(pollutions__year=YEAR, pollutions__pollutant="CO2", pollutions__releasesto="Air")))
However, this returns too many items (or, when using Sum, a wrong total):
annotated.first().co2 # 60, but it should be 1
annotated.first().energy # 252, but it should be 1
although my database guarantees, as declared above, that (plantid, year, month) and (plantid, releasesto, pollutant, year) are unique together, which is easy to demonstrate:
pl = annotated.first().plantid
testplant = Plants.objects.get(pk=pl) # plant object
pco2 = Pollutions.objects.filter(plantid=testplant, year=YEAR, pollutant="CO2", releasesto="Air")
len(pco2) # 1, as expected
Why does Django return too many results, and how can I tell Django to restrict the annotated elements to the 'current primary key', in other words to annotate only the elements whose foreign key matches the primary key?
I can achieve what I intend to do by using distinct and Max:
energy=Sum('yearly__power', distinct=True, filter=Q(yearly__year=YEAR)),
co2=Max('pollutions__amount', ...
However, the performance is unacceptable.
I have tested using model_to_dict and appending the wanted values "by hand" to the dict. That works for the values themselves, but not for sorting the resulting dicts (e.g. by energy), and it is actually faster than the workaround directly above.
It strikes me as conceptually odd that the manual approach is faster than letting the database do what it is intended to do.
Is this a feature limitation of Django's ORM, or am I missing something?
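To illustrate where the inflated numbers most likely come from: annotating over two different reverse relations makes the query join both tables to the plant at once, so every monthly row is paired with every pollution row. A minimal sketch using an in-memory SQLite database (simplified schema, made-up rows, no year filter) shows the effect and why distinct on the summed value is not a safe fix:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE plants (plantid TEXT PRIMARY KEY);
CREATE TABLE monthp (monthpid INTEGER PRIMARY KEY, plantid TEXT,
                     year INTEGER, month INTEGER, power INTEGER);
CREATE TABLE pollutions (pollutionsid INTEGER PRIMARY KEY, plantid TEXT,
                         year INTEGER, pollutant TEXT, amount REAL);
INSERT INTO plants VALUES ('P1');
""")
# One plant with 12 monthly rows and 3 pollution rows (made-up numbers).
conn.executemany(
    "INSERT INTO monthp (plantid, year, month, power) VALUES ('P1', 2019, ?, 100)",
    [(m,) for m in range(1, 13)],
)
conn.executemany(
    "INSERT INTO pollutions (plantid, year, pollutant, amount) VALUES ('P1', 2019, ?, 1.5)",
    [("CO2",), ("NOx",), ("SO2",)],
)

# Joining both child tables yields 12 * 3 = 36 rows for the plant.
row = conn.execute("""
    SELECT COUNT(m.power),                 -- inflated by the pollution rows
           COUNT(p.amount),                -- inflated by the monthly rows
           COUNT(DISTINCT m.monthpid),     -- correct row count
           COUNT(DISTINCT p.pollutionsid), -- correct row count
           SUM(m.power),                   -- three times the real total
           SUM(DISTINCT m.power)           -- collapses equal monthly values
    FROM plants pl
    LEFT JOIN monthp m ON m.plantid = pl.plantid
    LEFT JOIN pollutions p ON p.plantid = pl.plantid
    GROUP BY pl.plantid
""").fetchone()
print(row)
```

The real yearly total here is 1200, which neither SUM variant returns; that is why a per-plant subquery tends to be the more reliable route.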
EDIT1:
This behaviour has been a known bug for 11 years.
Others, too, have "spent a whole day on this".
I am now trying subqueries. However, the foreign key I am using is not the primary key of its table, so the "usual" approach of using "pk=''" does not work. More precisely, trying:
tmp = Plants.objects.filter(somefilter)
subq1 = Subquery(Yearly.objects.filter(pk=OuterRef('plantid'), year=YEAR))
tmp1 = tmp.all().annotate(
    energy=Count(Subquery(subq1))
)
returns
OperationalError at /xyz
no such column: U0.yid
This definitely makes sense, because Plants has no clue what a yid is; it only knows plantids. How do I adjust the subquery accordingly?
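One way to think about the subquery is that the inner query should match its own foreign key column against the outer plant's key, rather than matching the inner table's primary key. In ORM terms that would presumably mean filtering the inner queryset on plantid=OuterRef('pk') and reducing it to a single value before wrapping it in Subquery. The Yearly model is not shown, so the sketch below uses the Monthp and Pollutions tables from the question and demonstrates only the SQL shape, with an in-memory SQLite database and made-up rows:

```python
import sqlite3

YEAR = 2019  # assumed example year

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE plants (plantid TEXT PRIMARY KEY);
CREATE TABLE monthp (monthpid INTEGER PRIMARY KEY, plantid TEXT,
                     year INTEGER, month INTEGER, power INTEGER);
CREATE TABLE pollutions (pollutionsid INTEGER PRIMARY KEY, plantid TEXT, year INTEGER,
                         pollutant TEXT, releasesto TEXT, amount REAL);
INSERT INTO plants VALUES ('P1'), ('P2');
""")
monthly = [
    (plant, year, month, power)
    for plant, power in (("P1", 100), ("P2", 50))
    for year in (2018, 2019)
    for month in range(1, 13)
]
conn.executemany(
    "INSERT INTO monthp (plantid, year, month, power) VALUES (?, ?, ?, ?)", monthly
)
conn.executemany(
    "INSERT INTO pollutions (plantid, year, pollutant, releasesto, amount) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("P1", 2019, "CO2", "Air", 5.0),
        ("P1", 2019, "NOx", "Air", 0.3),
        ("P1", 2018, "CO2", "Air", 4.0),
        ("P2", 2019, "CO2", "Air", 2.5),
    ],
)

# Each correlated subquery sees only the rows of the current plant, so the
# two child tables never multiply each other.
rows = conn.execute("""
    SELECT pl.plantid,
           (SELECT SUM(m.power) FROM monthp m
             WHERE m.plantid = pl.plantid AND m.year = :year) AS energy,
           (SELECT p.amount FROM pollutions p
             WHERE p.plantid = pl.plantid AND p.year = :year
               AND p.pollutant = 'CO2' AND p.releasesto = 'Air') AS co2
    FROM plants pl
    ORDER BY energy DESC
""", {"year": YEAR}).fetchall()
print(rows)
```

Because the result columns are ordinary values, ordering by energy or co2 works directly in the database.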

Django - aggregate fields value from joined model

My goal here seems simple: display the Sum (aggregation) of a particular field of a related model.
The difficulty lies in the current set-up. Please take a look and let me know whether it needs to be changed or whether I can achieve the goal with the current models:
class Route(models.Model):
    name = models.CharField(max_length=50)
    route_length = models.IntegerField()
class Race(models.Model):
    race_cod = models.CharField(max_length=6, unique=True)
    route_id = models.ForeignKey(Route, on_delete=models.CASCADE, related_name='b_route')
class Results(models.Model):
    race_id = models.ForeignKey(Race, on_delete=models.CASCADE, related_name='r_race')
    runner_id = models.ForeignKey(Runner, on_delete=models.CASCADE, related_name='r_runner')
Now I am trying to produce something like a yearly summary:
Runner X has raced in 12 races with a total distance of 134 km.
I was able to count the number of races like this (views.py):
runner = Runner.objects.get(pk=pk)
number_races = Results.objects.filter(runner_id=runner).count()
For computing the distance I have tried:
distance = Results.objects.filter(runner_id=runner).annotate(total_km=Sum(race_id.route_id.route_length))
This code errors out on the distance line in views.py:
Exception Type: NameError
Exception Value: name 'race_id' is not defined
I am sure I have not understood exactly how this works. Would anybody be kind enough to clarify this issue?
Thank you
My workaround is the following:
tmp_race_id = Results.objects.filter(runner_id=runner).values('race_id')
tmp_route_id = Race.objects.filter(pk__in=tmp_race_id).values('route_id')
distance = Route.objects.filter(pk__in=tmp_route_id).aggregate(Sum("route_length"))['route_length__sum'] or 0.00
Thank you, Jorge Lopez, for the hint.
You don't need a Results model; you can calculate this from the data in the other models. Can you share your Runner model? That model needs to have a foreign key to a Race. If so, you can go Route -> Race -> Runner in your query, and you can use the same query for the count, so you end up with one variable holding the count and one holding the distance. To do a Sum in your query, use aggregate rather than annotate, something like this:
.aggregate(total=Coalesce(Sum('route_length'), 0))['total']
Do it like this:
from django.db.models import Sum, Count
u = Runner.objects.annotate(
    tot_result=Count('r_runner'),
    tot_km=Sum('r_runner__race_id__route_id__route_length')
)
for i in u:
    print('Total_race {} -- Total_Km {}'.format(i.tot_result, i.tot_km))
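As a side note on why the string lookup matters: Sum(race_id.route_id.route_length) is evaluated as plain Python attribute access, hence the NameError, whereas a quoted path such as 'race_id__route_id__route_length' tells the ORM which joins to follow. The sketch below shows the kind of join this is expected to translate to, using an in-memory SQLite database with a simplified, assumed schema and made-up rows. It also suggests that the pk__in workaround from the question would count a route only once even if the runner raced on it several times.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE route (id INTEGER PRIMARY KEY, name TEXT, route_length INTEGER);
CREATE TABLE race (id INTEGER PRIMARY KEY, race_cod TEXT UNIQUE, route_id INTEGER);
CREATE TABLE results (id INTEGER PRIMARY KEY, race_id INTEGER, runner_id INTEGER);
INSERT INTO route VALUES (1, 'Park loop', 10), (2, 'Hill trail', 21);
INSERT INTO race VALUES (1, 'R00001', 1), (2, 'R00002', 2), (3, 'R00003', 1);
-- Runner 7 ran three races, two of them on the same route.
INSERT INTO results VALUES (1, 1, 7), (2, 2, 7), (3, 3, 7), (4, 1, 8);
""")


def runner_summary(runner_id):
    """Return (number of races, total distance) for one runner."""
    return conn.execute("""
        SELECT COUNT(*), COALESCE(SUM(route.route_length), 0)
        FROM results
        JOIN race ON race.id = results.race_id
        JOIN route ON route.id = race.route_id
        WHERE results.runner_id = ?
    """, (runner_id,)).fetchone()


def distinct_route_distance(runner_id):
    """Mimics the pk__in workaround: every route is counted at most once."""
    return conn.execute("""
        SELECT COALESCE(SUM(route_length), 0) FROM route
        WHERE id IN (SELECT route_id FROM race WHERE id IN
                     (SELECT race_id FROM results WHERE runner_id = ?))
    """, (runner_id,)).fetchone()[0]


print(runner_summary(7), distinct_route_distance(7))
```

For runner 7 the join counts the repeated route twice, while the pk__in variant does not, so the two totals differ.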

Using property in Django

I need help with the following situation.
My app has the following models:
class Person(models.Model):
    person_sequential_nr = models.IntegerField(primary_key=True)
    person_id = models.CharField(max_length=10)
    person_name = models.CharField(max_length=200)
    year = models.CharField(max_length=4)
    @property
    def _summarize_goods(self):
        return self.goods_set.all().aggregate(Sum('good_value')).values()
    patrimony = property(_summarize_goods)
class Goods(models.Model):
    person_sequential_nr = models.ForeignKey(Person)
    good_description = models.CharField(max_length=200)
    good_value = models.DecimalField(max_digits=12, decimal_places=2)
    year = models.CharField(max_length=4)
Year is a string like 2012, 2010, 2008, etc
person_sequential_nr is specific (different) for each year.
person_id is the same for all years.
The intention of _summarize_goods is to total all goods of a person in a specific year.
1) How can I get the top ten people with the highest patrimonies?
When I call Person.patrimony it says "TypeError: 'property' object is not callable"
Person._summarize_goods works, but I have no idea how to order it.
2) How can I calculate the patrimony variation of a person from one year to another (in the past)?
I would like to have something like: variation = (patrimony(year='2012')/patrimony(year='2010') - 1) * 100
I suppose that variation should also be a property, because I would like to use it to order some records.
An additional problem is that the person data may exist in 2012 but not in an earlier year (e.g. 2010), so I need to handle this situation.
3) How can I create a view to show the patrimony of a person?
self.goods_set.all().aggregate(Sum('good_value')) was returning a dictionary, so I added .values() to extract only its values, and then I got a list.
But when I use str(Person._summarize_goods) it seems that I still have a list.
Because when I call:
all_people = Person.objects.all()
people_list = [[p.person_name, str(p._summarize_goods)] for p in all_people]
output = ','.join(people_list)
return HttpResponse(output)
It shows an error referring to the output = line:
TypeError at /
sequence item 0: expected string, list found
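The TypeError appears to be about the outer structure rather than the aggregate itself: each element of people_list is a two-item list, and str.join accepts only strings. A minimal sketch of one way to flatten it, using plain tuples as hypothetical stand-ins for the people and their aggregate() results (the good_value__sum key follows Django's default naming for Sum('good_value')):

```python
from decimal import Decimal

# Hypothetical stand-ins: (person_name, result of aggregate(Sum('good_value'))).
people = [
    ("Ana", {"good_value__sum": Decimal("1500.00")}),
    ("Bruno", {"good_value__sum": None}),  # no goods at all: Sum yields None
]


def format_people(rows):
    """Build one string per person so that join() only ever sees strings."""
    parts = []
    for name, aggregate in rows:
        total = aggregate["good_value__sum"] or Decimal("0")
        parts.append("{}: {}".format(name, total))
    return ", ".join(parts)


output = format_people(people)
print(output)
```

Reading the value by key also avoids carrying around the list that .values() produces.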
EDITING...
I found some answers:
After removing the decoration (thanks Daniel) Person.patrimony is working, so:
1) How can I get the top ten people with the highest patrimonies?
This was solved with the code below:
def ten_highest_pat():
    people_2012 = Person.objects.all().filter(year=2012)
    return sorted(people_2012, key=lambda person: person.patrimony, reverse=True)[:10]
2) How can I calculate the patrimony variation of a person from one year to another (in the past)?
I tried the code below, which works but is too slow. I would be grateful if someone has a suggestion on how to improve it.
def _calc_pat_variation(self):
    c = Person.objects.all().filter(person_id = self.person_id, year__lt=self.year).order_by('-year')
    if c.count() >= 1 and c[0].patrimony != 0:
        return ((self.patrimony / c[0].patrimony) - 1) * 100
    else:
        return 'Not available'
pat_variation = property(_calc_pat_variation)
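The slowness probably comes from running several queries per person (one to find the earlier record, plus an aggregate for every patrimony access). One possible alternative is to fetch the yearly totals once and do the comparison in memory. The sketch below assumes such totals are already available as a dictionary keyed by (person_id, year); that dictionary, the function names and the sample figures are all hypothetical.

```python
from decimal import Decimal


def pct_variation(current, previous):
    """Percentage change, or None when there is no usable earlier figure."""
    if current is None or not previous:  # missing or zero previous patrimony
        return None
    return (current / previous - 1) * 100


def variations(totals, year):
    """totals maps (person_id, year) -> patrimony; years are strings like '2012'."""
    by_person = {}
    for (person_id, y), value in totals.items():
        by_person.setdefault(person_id, {})[y] = value
    result = {}
    for person_id, per_year in by_person.items():
        if year not in per_year:
            continue
        # Four-digit year strings compare correctly as text.
        earlier = [y for y in per_year if y < year]
        previous = per_year[max(earlier)] if earlier else None
        result[person_id] = pct_variation(per_year[year], previous)
    return result


def rank_by_variation(result):
    """Order people by variation, leaving out those without a comparison year."""
    known = [(pid, v) for pid, v in result.items() if v is not None]
    return sorted(known, key=lambda item: item[1], reverse=True)


totals = {
    ("A", "2010"): Decimal("100"), ("A", "2012"): Decimal("150"),
    ("B", "2012"): Decimal("80"),                                  # no earlier year
    ("C", "2008"): Decimal("0"), ("C", "2012"): Decimal("10"),     # zero baseline
    ("D", "2010"): Decimal("200"), ("D", "2012"): Decimal("100"),
}
result = variations(totals, "2012")
print(rank_by_variation(result))
```

Returning None instead of the string 'Not available' keeps the values sortable; the display text can be chosen in the view or template.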

How do I use Django's "extra" method to filter by a calculated property?

I've got a search function in my app that receives "cities" and "duration" inputs (both lists) and returns the top 30 matching "package" results sorted by package "rating".
It would be easy to implement if all the parameters were columns, but "duration" and "rating" are calculated properties. This means that I can't use a standard Django query to filter the packages. It seems that Django's "extra" method is what I need to use here, but my SQL isn't great and this seems like a pretty complex query.
Is the extra method what I should be using here? If so, what would that statement look like?
Applicable code copied below.
# models.py
class City(models.Model):
    ...
    city = models.CharField(max_length = 100)
class Package(models.Model):
    ....
    city = models.ManyToManyField(City, through = 'PackageCity')
    @property
    def duration(self):
        duration = len(Itinerary.objects.filter(package = self))
        return duration
    @property
    def rating(self):
        # do something to get the rating
        return unicode(rating)
class PackageCity(models.Model):
    package = models.ForeignKey(Package)
    city = models.ForeignKey(City)
class Itinerary(models.Model):
    # An Itinerary object is a day in a package, so len(Itinerary) works for the duration
    ...
    package = models.ForeignKey(Package)
# functions.py
def get_packages(city, duration):
    cities = City.objects.filter(city = city) # works fine
    duration_list = range(int(duration_array[0], 10), int(duration_array[1], 10) + 1) # works fine
    # What I want to do, but can't because duration & rating are calculated properties
    packages = Package.objects.filter(city__in = cities, duration__in = duration_array).order_by('rating')[:30]
First off, don't use len() on QuerySets just to count rows; use count().
https://docs.djangoproject.com/en/dev/ref/models/querysets/#when-querysets-are-evaluated
Second, assuming your rating property does something like calculating an average rating, you could use annotate:
https://docs.djangoproject.com/en/dev/ref/models/querysets/#annotate
Then you can do something like the following:
from django.db.models import Avg, Count
queryset = Package.objects.annotate(duration=Count('itinerary', distinct=True), rating=Avg('packagereview__rating'))
Here 'itinerary' is the reverse lookup name for Itinerary's ForeignKey to Package (adjust it if you set a related_name), and "PackageReview" is a fake model I just made up that has a ForeignKey to Package and a "rating" field. If you keep the duration and rating properties on the model, pick different annotation names, since an annotation would likely clash with a read-only property of the same name.
Then you can filter the annotated queryset as described here:
https://docs.djangoproject.com/en/dev/topics/db/aggregation/#filtering-on-annotations
(Take note of the clause order differences between annotate -> filter and filter -> annotate.)
Properties are calculated at run time in Python, so you really can't use them for filtering or ordering in a query.
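To make the annotate-then-filter idea concrete, here is a rough sketch of the SQL shape it corresponds to: group per package, compute the duration and average rating, keep only the groups inside the requested duration range, then order and limit. It runs on an in-memory SQLite database with an assumed, simplified schema; the packagereview table mirrors the made-up PackageReview model, the city filter is left out, and all rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE package (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE itinerary (id INTEGER PRIMARY KEY, package_id INTEGER);
CREATE TABLE packagereview (id INTEGER PRIMARY KEY, package_id INTEGER, rating INTEGER);
INSERT INTO package VALUES (1, 'Weekend'), (2, 'Week'), (3, 'Fortnight');
""")
# One itinerary row per day of each package.
conn.executemany(
    "INSERT INTO itinerary (package_id) VALUES (?)",
    [(1,)] * 3 + [(2,)] * 7 + [(3,)] * 14,
)
conn.executemany(
    "INSERT INTO packagereview (package_id, rating) VALUES (?, ?)",
    [(1, 4), (1, 5), (2, 3), (2, 5), (3, 5)],
)


def search(min_days, max_days, limit=30):
    """Packages whose day count lies in the range, best average rating first."""
    return conn.execute("""
        SELECT p.name,
               COUNT(DISTINCT i.id) AS duration,  -- DISTINCT guards against join fan-out
               AVG(r.rating) AS avg_rating
        FROM package p
        LEFT JOIN itinerary i ON i.package_id = p.id
        LEFT JOIN packagereview r ON r.package_id = p.id
        GROUP BY p.id
        HAVING COUNT(DISTINCT i.id) BETWEEN ? AND ?
        ORDER BY avg_rating DESC
        LIMIT ?
    """, (min_days, max_days, limit)).fetchall()


print(search(3, 7))
```

The HAVING clause plays the role of a filter applied after annotate, which is why the clause order in the ORM matters.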

"SELECT...AS..." with related model data in Django

I have an application where users select their own display columns. Each display column has a specified formula. To compute that formula, I need to join a few related models (one-to-one relationships) and compute the value.
The models look like this (just an example; the actual model has more than 100 fields):
class CompanyCode(models.Model):
    """Various Company Codes"""
    nse_code = models.CharField(max_length=20)
    bse_code = models.CharField(max_length=20)
    isin_code = models.CharField(max_length=20)
class Quarter(models.Model):
    """Company Quarterly Result Figures"""
    company_code = models.OneToOneField(CompanyCode)
    sales_now = models.IntegerField()
    sales_previous = models.IntegerField()
I tried doing:
ratios = {'growth':'quarter__sales_now / quarter__sales_previous'}
CompanyCode.objects.extra(select=ratios)
# raises "Unknown column 'quarter__sales_now' in 'field list'"
I also tried using a raw query:
query = ','.join(['round((%s),2) AS %s' % (formula, ratio_name)
                  for ratio_name, formula in ratios.iteritems()])
companies = CompanyCode.objects.raw("""
    SELECT `backend_companycode`.`id`, %s
    FROM `backend_companycode`
    INNER JOIN `backend_quarter` ON ( `backend_companycode`.`id` = `backend_companyquarter`.`company_code_id` )
    """, [query])
# This just gives an empty result
So please give me a hint as to how I can use related columns, preferably using extra(). Thanks.
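Regarding the raw-query attempt: one likely contributor to the odd result is that the formula text is passed as a query parameter. Parameters are bound as values, so the database sees a string literal rather than an expression. The sketch below demonstrates this with Python's sqlite3 module, which uses ? placeholders where Django's raw() uses %s; the table is a simplified, assumed version of the Quarter table with one made-up row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE quarter (company_code_id INTEGER, sales_now INTEGER, sales_previous INTEGER)"
)
conn.execute("INSERT INTO quarter VALUES (1, 150, 100)")

formula = "sales_now * 1.0 / sales_previous"

# Bound as a parameter, the formula is treated as a plain string value.
as_parameter = conn.execute("SELECT ? FROM quarter", (formula,)).fetchone()[0]

# Placed into the statement text, it is evaluated as an expression.
# This is only reasonable when the formula is defined by the application,
# never when it comes from user input.
as_sql = conn.execute(
    "SELECT ROUND({}, 2) FROM quarter".format(formula)
).fetchone()[0]

print(repr(as_parameter), as_sql)
```

The first query returns the formula text itself, the second returns the computed ratio.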
By now the Django documentation says that extra() should be used only as a last resort.
So here is a query without extra():
from django.db.models import F
CompanyCode.objects.annotate(
    growth=F('quarter__sales_now') / F('quarter__sales_previous'),
)
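One caveat worth checking with this kind of ratio: both fields are IntegerFields, and some backends (PostgreSQL and SQLite, for example) perform integer division on integer operands, which would truncate the growth figure. The sketch below shows the effect and a float-based variant on an in-memory SQLite database; the table names follow the raw query in the question and the rows are made up. On the ORM side, casting one operand to a float field would presumably have the same effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE backend_companycode (id INTEGER PRIMARY KEY, nse_code TEXT);
CREATE TABLE backend_quarter (id INTEGER PRIMARY KEY, company_code_id INTEGER UNIQUE,
                              sales_now INTEGER, sales_previous INTEGER);
INSERT INTO backend_companycode VALUES (1, 'ABC'), (2, 'XYZ');
INSERT INTO backend_quarter VALUES (1, 1, 150, 100), (2, 2, 90, 120);
""")

rows = conn.execute("""
    SELECT c.nse_code,
           q.sales_now / q.sales_previous AS growth_int,          -- integer division
           ROUND(q.sales_now * 1.0 / q.sales_previous, 2) AS growth
    FROM backend_companycode c
    JOIN backend_quarter q ON q.company_code_id = c.id
    ORDER BY c.id
""").fetchall()
print(rows)
```

The join also shows the general point of the thread: the related table has to be part of the query before its columns can appear in a computed SELECT expression.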
Since the calculation is being done on a single Quarter instance, where's the need to do it in the SELECT? You could just define a ratio method/property on the Quarter model:
@property
def quarter(self):
    return self.sales_now / self.sales_previous
and call it where necessary.
OK, I figured it out. In the code above, using:
CompanyCode.objects.select_related('quarter').extra(select=ratios)
solved the problem.
Basically, to access any related model's data through extra(), we just need to ensure that the model is joined in our query. With select_related, the query automatically joins the mentioned models.
Thanks :).