How to calculate sum of the difference between two dates - django

I have this model
class Exemple(models.Model):
    from_date = models.DateField()
    until_date = models.DateField()
    person = models.ForeignKey(Person, on_delete=models.CASCADE)
I have a yearly allowance, for example 100, and I need to subtract from it the total number of days recorded for that person. That is, I must compute the number of days for each of that person's rows, sum them, and then calculate 100 minus that sum.

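Assuming persons contains your Person objects, you could do something like this:
for person in persons:
    total_days = 0
    for exemple in Exemple.objects.filter(person=person):
        # Subtracting two dates gives a timedelta, so take .days before comparing with 1
        total_days += max(1, (exemple.until_date - exemple.from_date).days)
    remaining = 100 - total_days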
Explanation:
1) The computation is done person by person
2) For each person, you go through every Exemple row
3) You sum every "until_date - from_date". The max() returns 1 when until_date - from_date equals 0 (because you said you don't want it to be 0)
I don't know if you want to store the result somewhere or just compute it in a method, so I only wrote this small sample to show the logic. You'll have to adapt it to suit your needs.
However, this might not be the prettiest way to achieve your goal.
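The same arithmetic can be sketched without the ORM, which makes it easy to check on sample data. This is only an illustration: remaining_days and its arguments are hypothetical names, and it keeps the rule above that a row whose two dates are equal counts as one day.

```python
from datetime import date


def remaining_days(allowance, ranges):
    """Return the allowance minus the days used by all (from_date, until_date) pairs."""
    # (until - start) is a timedelta; .days is its whole-day count.
    used = sum(max(1, (until - start).days) for start, until in ranges)
    return allowance - used


ranges = [
    (date(2024, 1, 1), date(2024, 1, 4)),  # 3 days
    (date(2024, 2, 1), date(2024, 2, 1)),  # same day, counted as 1
]
print(remaining_days(100, ranges))
```

With the ORM, the pairs could come from something like Exemple.objects.filter(person=person).values_list('from_date', 'until_date').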

Related

Django, get sum of amount for the last day?

class Point(models.Model):
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    expire_date = models.DateField()
    amount = models.IntegerField()
I want to know the sum of amount for the latest expire_date of a given user.
There can be multiple points for a user, including several with the same expire_date.
I could run two queries, first fetching the latest expire_date and then aggregating on it, but I'd like to know if there's a better way.
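We can use a subquery here. Assuming user holds the given user, restrict both the outer query and the subquery to that user:
from django.db.models import Sum
user_points = Point.objects.filter(user=user)
user_points.filter(
    expire_date__gte=user_points.order_by('-expire_date').values('expire_date')[:1]
).aggregate(total=Sum('amount'))
This will thus result in a query that looks like (with 1 standing in for the user's id):
SELECT SUM(point.amount) AS total
FROM point
WHERE point.user_id = 1
  AND point.expire_date >= (
    SELECT U0.expire_date
    FROM point U0
    WHERE U0.user_id = 1
    ORDER BY U0.expire_date DESC
    LIMIT 1
)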
I have not run performance tests on it, so I suggest you first measure whether this actually improves performance significantly.
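To check what the query returns, the intended result can be computed in plain Python on a small sample. This is a sketch, not part of the answer: latest_expiry_total is a hypothetical helper, and points is assumed to be the (expire_date, amount) pairs of a single user.

```python
from datetime import date


def latest_expiry_total(points):
    """Sum the amounts of the points that share the latest expire_date."""
    points = list(points)
    if not points:
        return None  # no points at all for this user
    latest = max(expire_date for expire_date, _ in points)
    return sum(amount for expire_date, amount in points if expire_date == latest)


sample = [
    (date(2024, 5, 1), 10),
    (date(2024, 6, 1), 20),
    (date(2024, 6, 1), 5),
]
print(latest_expiry_total(sample))
```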

How to use an aggregate in a case statement in Django

I am trying to use an aggregated column in a case statement in Django and I am having no luck getting Django to accept it.
The code is to return a list of people who have played a game, the number of times they have played the game and their total score. The list is sorted by total score descending. However, the game has a minimum number of plays in order to qualify. Players without sufficient plays are listed at the bottom. For example:
Player  Total  Plays
Jill    109    10
Sam     92     11
Jack    45     9
Sue     50     3
Sue is fourth in the list because her number of plays (3) is less than the minimum (5).
The relevant models and function are:
from django.db.models import Case, Count, Q, Sum, When
class Player(models.Model):
    name = models.CharField()
class Game(models.Model):
    name = models.CharField()
    min_plays = models.IntegerField(default=1)
class Play(models.Model):
    game = models.ForeignKey(Game, on_delete=models.CASCADE)
class Score(models.Model):
    play = models.ForeignKey(Play, on_delete=models.CASCADE)
    player = models.ForeignKey(Player, on_delete=models.CASCADE)
    score = models.IntegerField()
def game_standings(game):
    query = Player.objects.filter(score__play__game_id=game.id)
    query = query.annotate(plays=Count('score', filter=Q(score__play__game_id=game.id)))
    query = query.annotate(total_score=Sum('score', filter=Q(score__play__game_id=game.id)))
    # This is the annotate call that raises the error described below
    query = query.annotate(sufficient=Case(When(plays__ge=game.min_plays, then=1), default=0))
    query = query.order_by('-sufficient', '-total_score', 'plays')
    return query
When the last annotate method is hit, a "Unsupported lookup 'ge' for IntegerField or join on the field not permitted" error is reported. I tried to change the case statement to embed the count instead of using the annotated field:
query = query.annotate(
    sufficient=Case(
        When(Q(Count('score', filter=Q(score__play__game_id=game.id))) > 3, then=1),
        default=0
    )
)
but Django reports a TypeError with '>' and Q and int.
The SQL I am trying to get to is:
SELECT "player"."id",
       "player"."name",
       COUNT("score"."id") FILTER (WHERE "play"."game_id" = 8) AS "plays",
       SUM("score"."score") FILTER (WHERE "play"."game_id" = 8) AS "total_score",
       CASE WHEN COUNT("score"."id") FILTER (WHERE "play"."game_id" = 8) >= 5 THEN 1
            ELSE 0
       END AS sufficient
FROM "player"
LEFT OUTER JOIN "score" ON ("player"."id" = "score"."player_id")
LEFT OUTER JOIN "play" ON ("score"."play_id" = "play"."id")
WHERE "play"."game_id" = 8
GROUP BY "player"."id"
ORDER BY sufficient DESC, total_score DESC
I can't seem to figure out how to make the case statement use the play count.
Thanks
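One thing worth checking first: Django's lookup for "greater than or equal" is spelled gte, not ge, which would explain the "Unsupported lookup 'ge'" message; plays__gte is the likely fix, though it has not been tested against these models. Independently of the ORM, the ordering the query is meant to produce can be sketched in plain Python and used to verify the result. The standings function and its tuple layout are hypothetical, chosen only for this illustration.

```python
def standings(players, min_plays):
    """Sort (name, total_score, plays) rows: qualified players first,
    then by total score descending, then by plays ascending."""
    def sort_key(row):
        _, total, plays = row
        sufficient = 1 if plays >= min_plays else 0
        return (-sufficient, -total, plays)

    return sorted(players, key=sort_key)


players = [("Sue", 50, 3), ("Jack", 45, 9), ("Jill", 109, 10), ("Sam", 92, 11)]
for name, total, plays in standings(players, min_plays=5):
    print(name, total, plays)
```

Sue has a higher total than Jack but fewer than 5 plays, so she sorts after every qualified player, as in the table above.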

Duplicates in Django QuerySet when using annotate + filter + annotate

I'd like to annotate a queryset based on both all related objects and filtered subset. Let's say we have some books and they are sold at some stores at some prices. Now, for one book, I'd like to get all the stores which sell that book, the price of the book in those stores and the average price of books in each of those stores.
My models.py:
from django.db import models
class Book(models.Model):
    title = models.CharField(max_length=100)
class Store(models.Model):
    name = models.CharField(max_length=100)
    books = models.ManyToManyField(Book, through='BookInStore')
class BookInStore(models.Model):
    book = models.ForeignKey(Book, on_delete=models.CASCADE)
    store = models.ForeignKey(Store, on_delete=models.CASCADE)
    price = models.IntegerField()
    class Meta:
        unique_together = ('book', 'store')
Create some objects:
book1 = Book.objects.create(title='book1')
book2 = Book.objects.create(title='book2')
book3 = Book.objects.create(title='book3')
store = Store.objects.create(name='store')
BookInStore.objects.create(book=book1, store=store, price=10)
BookInStore.objects.create(book=book2, store=store, price=100)
BookInStore.objects.create(book=book3, store=store, price=1000)
Now, for book1, I'm trying to filter the stores that sell book1, get its price in those stores and also the average price of all books in each of the stores:
from django.db.models import Avg, Count, Sum
book_availability = (
    Store.objects
    .annotate(avg_price=Avg('bookinstore__price'))
    .filter(bookinstore__book=book1)
    .annotate(
        duplicates=Count('bookinstore__price'),
        book_price=Sum('bookinstore__price')
    )
)
However, it doesn't work correctly:
for b in book_availability:
    print("Avg price:", b.avg_price)
    print("Number of copies (should be 1):", b.duplicates)
    print("Price of book1 (should be 10):", b.book_price)
I get the following output:
Avg price: 370.0
Number of copies (should be 1): 3
Price of book1 (should be 10): 30
The average price is correct. But for some reason, the price of the book has been multiplied by the number of total books in the store. What am I doing wrong? How should I get the kind of queryset I'm after?
One workaround: Because book&store combinations are unique, it shouldn't matter whether one uses Sum, Avg, Min, Max or something similar for getting the price of the particular book, because there should be only one item. However, using Sum sums the weird duplicates, so it is better to use Min or Max to get the correct integer because they give the correct answer in the presence of duplicates:
book_availability = (
    Store.objects
    .annotate(avg_price=Avg('bookinstore__price'))
    .filter(bookinstore__book=book1)
    .annotate(
        duplicates=Count('bookinstore__price'),
        book_price=Min('bookinstore__price')
    )
)
And printing now:
for b in book_availability:
    print("Avg price:", b.avg_price)
    print("Number of copies (should be 1):", b.duplicates)
    print("Price of book1 (should be 10):", b.book_price)
We see the correct book price now:
Avg price: 370.0
Number of copies (should be 1): 3
Price of book1 (should be 10): 10
This workaround happens to give the correct answer but it doesn't remove the duplicates, so I'm still wondering if there is something wrong in the approach.
One possibility is to use Case and When to do the filtering for the annotation:
book_availability = (
    Store.objects
    .annotate(
        avg_price=Avg('bookinstore__price'),
        book_price=Sum(Case(When(
            bookinstore__book=book1,
            then='bookinstore__price'
        )))
    )
    .filter(bookinstore__book=book1)
)
for b in book_availability:
    print("Avg price:", b.avg_price)
    print("Price of book1 (should be 10):", b.book_price)
Which gives:
Avg price: 370.0
Price of book1 (should be 10): 10
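The numbers in the question are consistent with the bookinstore table being joined twice: once for the Avg annotation (all rows of the store) and once for the filter (only the book1 row), with the later aggregates reading from the second join. That is an assumption inferred from the output, not confirmed from the generated SQL. A plain-Python simulation of such a double join, with a hypothetical simulate_double_join helper, reproduces the same figures:

```python
from itertools import product


def simulate_double_join(rows, book):
    """rows: (book, price) pairs of one store. Pair every row (first join)
    with every row matching the filter (second join) and aggregate."""
    filtered = [row for row in rows if row[0] == book]
    joined = list(product(rows, filtered))
    first_prices = [first[1] for first, _ in joined]     # prices seen by Avg
    second_prices = [second[1] for _, second in joined]  # prices seen by Count/Sum/Min
    return {
        "avg_price": sum(first_prices) / len(first_prices),
        "duplicates": len(second_prices),
        "sum_price": sum(second_prices),
        "min_price": min(second_prices),
    }


rows = [("book1", 10), ("book2", 100), ("book3", 1000)]
print(simulate_double_join(rows, "book1"))
```

The single book1 row is repeated once per row of the first join, which is why Sum triples the price while Min is unaffected, and why moving the condition into Case/When (a single join) avoids the repetition.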

Rating calculation

Maybe this is a dumb question, but I'm fighting with a rating calculation. I have a global rating, which is between 0 and 1, the number of ratings, and the user rating.
How can I calculate the new rating from:
the old global rating
the user rating
the new and old number of ratings
Any ideas?
Thank you.
Multiply the previous average rating by the previous number of ratings.
Add the new rating.
Divide by the previous number of ratings plus one to get the new average.
Careful with your choice of data types or you will lose a little accuracy each time you do this calculation.
Pseudocode:
newRating = ((oldRating * previousNumberOfRatings) + userRating) / (previousNumberOfRatings + 1)
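The pseudocode translates directly to Python. In this minimal sketch, updated_rating is a hypothetical name; passing fractions.Fraction values instead of floats is one way to avoid the gradual loss of accuracy mentioned above.

```python
from fractions import Fraction


def updated_rating(old_rating, old_count, user_rating):
    """Return the new average after adding one rating to old_count ratings."""
    return (old_rating * old_count + user_rating) / (old_count + 1)


# Floats: fine for display, but rounding error can accumulate over many updates.
print(updated_rating(0.5, 3, 1.0))

# Fractions: exact arithmetic, converted to float only when needed.
print(updated_rating(Fraction(1, 2), 3, Fraction(1)))
```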

Django : how to optimize this query on a large table

To give the context, I have a lot of temperature measurements taken at different stations and I want to check if it is in accordance with what was forecast.
My model is :
class Station(models.Model):
    station_id = models.CharField(max_length=18, primary_key=True)
    sector = models.CharField(max_length=40)
class Weather(models.Model):
    station = models.ForeignKey(Station, on_delete=models.CASCADE)
    temperature = models.FloatField()
    date = models.DateField()
class Forecast(models.Model):
    station = models.ForeignKey(Station, on_delete=models.CASCADE)
    date = models.DateField()
    score = models.IntegerField()
For each temperature measurement, I would like to know the average of the forecasting scores for the station over the last 7 days, unless there is another temperature measurement in this time frame, in which case that measurement is the starting point. The following code does what I want but is much too slow to execute (about 10 minutes!):
from datetime import timedelta
from django.db.models import Avg
observations = Weather.objects.all().order_by('station', 'date')
previous = None  # last observation seen, None on the first iteration
for obs in observations:
    # Default window start: 7 days before the observation
    date_inf = obs.date - timedelta(days=7)
    if previous is not None and obs.station == previous.station:
        date_inf = min(date_inf, previous.date)
    # One aggregate query per observation: this is the slow part
    forecast = Forecast.objects.filter(
        station=obs.station,
        date__gte=date_inf,
        date__lte=obs.date - timedelta(days=1),
    ).aggregate(average_score=Avg('score'))
    if forecast["average_score"] is not None:
        print(forecast["average_score"], obs.rating)
        # Some more code....
    previous = obs
How can I optimize the execution time? Is there a way to do it with a single query?
Thanks!
For every measurement, you recompute the average of the last 7 days with a separate database query. If your measurements are closer together than 7 days, the windows overlap. E.g. if your measurements are 1 day apart, each forecast is averaged again about 6 more times in the database, which is SLOW.
Your best bet is to grab all measurements, then all forecasts that match, then do the averaging in memory in Python. Sure, it is more Python code, but it will run faster.
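A minimal sketch of the in-memory approach, under stated assumptions: window_averages is a hypothetical helper, observations are (station, date) pairs and forecasts are (station, date, score) triples (for example from values_list), and the window starts at the later of "7 days before" and the previous measurement, following the prose description in the question. The question's loop uses min() instead, so swap max for min if that is the behaviour you need. Prefix sums and bisect make each window average cheap once the forecasts are sorted.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from datetime import date, timedelta
from itertools import accumulate


def window_averages(observations, forecasts, days=7):
    """Return (station, obs_date, average_score or None) per observation."""
    # Group forecasts per station, sorted by date, with prefix sums of scores.
    by_station = defaultdict(list)
    for station, day, score in forecasts:
        by_station[station].append((day, score))
    index = {}
    for station, rows in by_station.items():
        rows.sort()
        dates = [day for day, _ in rows]
        prefix = [0] + list(accumulate(score for _, score in rows))
        index[station] = (dates, prefix)

    results = []
    previous = {}  # station -> date of its previous observation
    for station, obs_date in sorted(observations):
        start = obs_date - timedelta(days=days)
        if station in previous:
            start = max(start, previous[station])  # assumption, see lead-in
        end = obs_date - timedelta(days=1)
        previous[station] = obs_date
        dates, prefix = index.get(station, ([], [0]))
        lo = bisect_left(dates, start)   # first forecast with date >= start
        hi = bisect_right(dates, end)    # one past the last forecast with date <= end
        avg = (prefix[hi] - prefix[lo]) / (hi - lo) if hi > lo else None
        results.append((station, obs_date, avg))
    return results


forecasts = [("A", date(2020, 1, d), d) for d in range(1, 12)]
observations = [("A", date(2020, 1, 10)), ("A", date(2020, 1, 12)), ("B", date(2020, 1, 5))]
for row in window_averages(observations, forecasts):
    print(row)
```

This replaces one query per measurement with two queries in total (all measurements, all forecasts), at the cost of holding both result sets in memory.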