Django - query for the closest numeric value to a given value

I have a Django model, Level, for a game:
class Level(models.Model):
    key = models.CharField(max_length=100)
    description = models.CharField(max_length=500)
    requiredPoints = models.IntegerField()
    badgeurl = models.CharField(max_length=100)
    challenge = models.ForeignKey(Challenge)
I now want to query for the highest level whose requiredPoints value is smaller than a given value.
If I have:
Level 1: requiredPoints: 200
Level 2: requiredPoints: 800
Level 3: requiredPoints: 2000
When I enter e.g. 900 or 1999 as a query parameter, I want level 2 to be returned, when entering 10000 it should be level 3.
In SQL it would look like:
select pointsRequired,
abs(pointsRequired - parameter) as closest
from the_table
order by closest
limit 1
Any tips? Do I have to use an extra Query-Set? What would it look like?

I don't think your SQL is correct. It would return 'level 3' if the parameter is 1999, because 2000 is the value closest to 1999. Based on your description, the SQL could be:
SELECT pointsRequired FROM the_table WHERE pointsRequired < parameter ORDER BY pointsRequired DESC LIMIT 1;
Or in Django:
try:
    level = Level.objects.filter(requiredPoints__lt=parameter).order_by('-requiredPoints')[0]
except IndexError:
    level = None  # do something: no level has requiredPoints below the parameter
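To check the corrected SQL against the example data from the question, here is a minimal sketch using Python's built-in sqlite3 module instead of Django. The table name `level`, the `name` column, and the helper `highest_level_below` are assumptions made for illustration only. On the ORM side, calling `.first()` on the ordered queryset is an alternative to indexing with `[0]`, since it returns None for an empty result rather than raising IndexError.

```python
import sqlite3


def highest_level_below(conn, points):
    """Return (name, requiredPoints) of the highest level below `points`, or None."""
    return conn.execute(
        "SELECT name, requiredPoints FROM level "
        "WHERE requiredPoints < ? "
        "ORDER BY requiredPoints DESC LIMIT 1",
        (points,),
    ).fetchone()


# In-memory database holding the three levels from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE level (name TEXT, requiredPoints INTEGER)")
conn.executemany(
    "INSERT INTO level VALUES (?, ?)",
    [("Level 1", 200), ("Level 2", 800), ("Level 3", 2000)],
)

for points in (900, 1999, 10000, 100):
    print(points, highest_level_below(conn, points))
```

With this data, 900 and 1999 both map to Level 2, 10000 maps to Level 3, and 100 yields no row at all, which corresponds to the IndexError branch in the Django version.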

Related

How to get Cartesian product of two tables in Django Queryset?

Is there a way to do the equivalent of a full outer join in Django? (I think I've read that full outer joins are not supported.)
My scenario is that I have three tables:
Staff / WeekList / WeeksCompleted
The relevant fields I'm trying to work with are:
Staff table - Staff Number.
WeekList table - Week Start date.
WeeksCompleted table - Week Start date and Staff Number.
Basically, everyone should have an entry in the WeeksCompleted table (if they're still active, that is, but that's not pertinent for this question). The queryset I would like to produce is a list of Staff who have missing weeks in the WeeksCompleted table.
I can get the result I want using SQL queries but it involves a full outer join on the Staff and WeekList tables. I was wondering if anyone knows of a way to do this using the queryset functions?
The only other way I can think to do the equivalent of the full join is to create a list using a nested loop of Staff Numbers against each week, which might have a sizeable processing overhead?
EDIT: if it helps, here are the three simplified models.
models.py
class Staff(models.Model):
    staff_number = models.CharField(max_length=9, null=True)
class WeekList(models.Model):
    week_start = models.DateField(null=True)
class WeeksCompleted(models.Model):
    staff = models.ForeignKey(to='weekscompleted.Staff', null=True, on_delete=models.PROTECT)
    week_list = models.ForeignKey(to='weekscompleted.WeekList', null=True, on_delete=models.PROTECT)
EDIT 2: The join I think I need is:
SELECT staff_number, week_start
FROM Staff, Contractor
GROUP BY staff_number, week_start
This will give a list of the expected weeks completed for staff:
week_start staff_number
17/10/2020 12345
17/10/2020 54321
I can then compare this to the WeeksCompleted table:
week_start staff_number
17/10/2020 12345
to find which staff are missing for a week using this query (keep in mind that this is a query I produced in a database):
SELECT qryShouldBeCompleted.week_start, qryShouldBeCompleted.staff_number
FROM qryShouldBeCompleted
LEFT JOIN WeeksCompleted ON qryShouldBeCompleted.staff_number =
WeeksCompleted.staff_number
AND qryShouldBeCompleted.week_start = WeeksCompleted.week_start
WHERE WeeksCompleted.staff_number Is Null
This would then produce the result I need:
week_start staff_number
17/10/2020 54321
Edit 3:
I just found an article on FilteredRelation that gets me partway there:
Staff.objects.annotate(missing=FilteredRelation('weekscompleted', condition=Q(weekscompleted__week_start='some date'))).values('staff_number', 'missing__staff__staff_number', 'missing__week_start')
which gets me this:
{'staff_number': '54321', 'missing__staff__staff_number': None, 'missing__week_start': None}
The only thing with this is that it only appears to work for one week at a time - using __lte in the condition doesn't return any 'None' values so I'd have to loop through each week...
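The two-step SQL described in the question (build every staff/week combination, then keep the combinations with no matching WeeksCompleted row) can be collapsed into one statement. The sketch below runs it with Python's built-in sqlite3 module rather than through the Django ORM. It assumes simplified, denormalised tables named `staff`, `week_list` and `weeks_completed`, and it assumes the second table in the question's cross join is meant to be WeekList. It is not a queryset solution, only a way to confirm the shape of the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified stand-ins for the three models (assumed layout).
    CREATE TABLE staff (staff_number TEXT);
    CREATE TABLE week_list (week_start TEXT);
    CREATE TABLE weeks_completed (staff_number TEXT, week_start TEXT);

    INSERT INTO staff VALUES ('12345'), ('54321');
    INSERT INTO week_list VALUES ('2020-10-17');
    INSERT INTO weeks_completed VALUES ('12345', '2020-10-17');
""")

# Cartesian product of staff and weeks, then an anti-join against the
# completed rows: only combinations with no match survive the WHERE.
missing = conn.execute("""
    SELECT w.week_start, s.staff_number
    FROM staff AS s
    CROSS JOIN week_list AS w
    LEFT JOIN weeks_completed AS c
           ON c.staff_number = s.staff_number
          AND c.week_start = w.week_start
    WHERE c.staff_number IS NULL
    ORDER BY w.week_start, s.staff_number
""").fetchall()

print(missing)
```

For the sample rows in the question this leaves only staff number 54321 for the week of 17/10/2020, matching the expected result.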

Are queries using related_name more performant in Django?

Let's say I have the following models set up:
class Shop(models.Model):
    ...
class Product(models.Model):
    shop = models.ForeignKey(Shop, related_name='products')
Now let's say we want to query all the products from the shop with label 'demo' whose prices are at or below $100. There are two ways to do this:
shop = Shop.objects.get(label='demo')
products = shop.products.filter(price__lte=100)
Or
shop = Shop.objects.get(label='demo')
products = Product.objects.filter(shop=shop, price__lte=100)
Is there a difference between these two queries? The first one is using the related_name property. I know foreign keys are indexed, so searching using them should be faster, but is this applicable in our first situation?
Short answer: this will result in equivalent queries.
We can do the test by printing the queries:
>>> print(shop.products.filter(price__lte=100).query)
SELECT "app_product"."id", "app_product"."shop_id", "app_product"."price" FROM "app_product" WHERE ("app_product"."shop_id" = 1 AND "app_product"."price" <= 100)
>>> print(Product.objects.filter(shop=shop, price__lte=100).query)
SELECT "app_product"."id", "app_product"."shop_id", "app_product"."price" FROM "app_product" WHERE ("app_product"."price" <= 100 AND "app_product"."shop_id" = 1)
Except that the conditions in the WHERE clause are swapped, the two queries are identical, and the order of the conditions usually makes no difference on the database side.
However, if you are not interested in the Shop object itself, you can filter with:
products = Product.objects.filter(shop__label='demo', price__lte=100)
This will make a JOIN at the database level, and will thus retrieve the data in a single pass:
SELECT "app_product"."id", "app_product"."shop_id", "app_product"."price"
FROM "app_product"
INNER JOIN "app_shop" ON "app_product"."shop_id" = "app_shop"."id"
WHERE "app_product"."price" <= 100 AND "app_shop"."label" = 'demo'
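As a rough check that the two-step lookup and the single JOIN select the same rows, the sketch below runs both against an in-memory SQLite database using Python's sqlite3 module. The table names follow the SQL printed above, while the sample rows and the second shop labelled 'other' are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_shop (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE app_product (
        id INTEGER PRIMARY KEY,
        shop_id INTEGER REFERENCES app_shop(id),
        price INTEGER
    );
    -- Invented sample data: one cheap and one expensive product in 'demo'.
    INSERT INTO app_shop VALUES (1, 'demo'), (2, 'other');
    INSERT INTO app_product VALUES (1, 1, 50), (2, 1, 150), (3, 2, 20);
""")

# Two round trips: fetch the shop first, then filter its products.
(shop_id,) = conn.execute(
    "SELECT id FROM app_shop WHERE label = ?", ("demo",)
).fetchone()
two_step = conn.execute(
    "SELECT id, shop_id, price FROM app_product "
    "WHERE shop_id = ? AND price <= 100 ORDER BY id",
    (shop_id,),
).fetchall()

# One round trip: join the shop table and filter on its label.
joined = conn.execute(
    "SELECT p.id, p.shop_id, p.price FROM app_product AS p "
    "INNER JOIN app_shop AS s ON p.shop_id = s.id "
    "WHERE p.price <= 100 AND s.label = ? ORDER BY p.id",
    ("demo",),
).fetchall()

print(two_step, joined)
```

Both approaches return the same product rows. The difference is only in the number of queries sent to the database.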

How to use an aggregate in a case statement in Django

I am trying to use an aggregated column in a case statement in Django and I am having no luck getting Django to accept it.
The code is to return a list of people who have played a game, the number of times they have played the game and their total score. The list is sorted by total score descending. However, the game has a minimum number of plays in order to qualify. Players without sufficient plays are listed at the bottom. For example:
Player  Total  Plays
Jill    109    10
Sam     92     11
Jack    45     9
Sue     50     3
Sue is fourth in the list because her number of plays (3) is less than the minimum (5).
The relevant models and function are:
class Player(models.Model):
    name = models.CharField()
class Game(models.Model):
    name = models.CharField()
    min_plays = models.IntegerField(default=1)
class Play(models.Model):
    game = models.ForeignKey(Game)
class Score(models.Model):
    play = models.ForeignKey(Play)
    player = models.ForeignKey(Player)
    score = models.IntegerField()
def game_standings(game):
    query = Player.objects.filter(score__play__game_id=game.id)
    query = query.annotate(plays=Count('score', filter=Q(score__play__game_id=game.id)))
    query = query.annotate(total_score=Sum('score', filter=Q(score__play__game_id=game.id)))
    query = query.annotate(sufficient=Case(When(plays__ge=game.min_plays, then=1), default=0))
    query = query.order_by('-sufficient', '-total_score', 'plays')
When the last annotate method is hit, an "Unsupported lookup 'ge' for IntegerField or join on the field not permitted" error is reported. I tried to change the case statement to embed the count instead of using the annotated field:
query = query.annotate(
    sufficient=Case(When(
        Q(Count('score', filter=Q(score__play__game_id=game.id))) > 3, then=1), default=0
    )
)
but Django reports a TypeError with '>' and Q and int.
The SQL I am trying to get to is:
SELECT "player"."id",
"player"."name",
COUNT("score"."id") FILTER (WHERE "play"."game_id" = 8) AS "plays",
SUM("score"."score") FILTER (WHERE "play"."game_id" = 8) AS "total_score",
case when COUNT("score"."id") FILTER (WHERE "play"."game_id" = 8) >= 5 then 1
else 0
end as sufficient
FROM "player"
LEFT OUTER JOIN "score" ON ("player"."id" = "score"."player_id")
LEFT OUTER JOIN "play" ON ("score"."play_id" = "play"."id")
WHERE "play"."game_id" = 8
GROUP BY "player"."id"
ORDER BY sufficient desc, total_score desc
I can't seem to figure out how to have the case statement use the play count.
Thanks
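One detail worth noting: Django's lookup name for "greater than or equal" is `gte`, not `ge`, which is consistent with the "Unsupported lookup 'ge'" message, so `plays__gte` would be the first thing to try in the When condition. Separately, the target SQL itself can be checked outside Django. The sketch below does that with Python's sqlite3 module, under several assumptions: the table layout is simplified, the individual scores are invented so that they add up to the totals in the question's table, and the FILTER clauses are omitted because the WHERE clause already restricts rows to one game.

```python
import sqlite3

MIN_PLAYS = 5  # qualifying minimum from the question's example

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE player (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE play (id INTEGER PRIMARY KEY, game_id INTEGER);
    CREATE TABLE score (play_id INTEGER, player_id INTEGER, score INTEGER);
""")

# (name, total, plays) taken from the question; per-play scores are made up.
standings = [("Jill", 109, 10), ("Sam", 92, 11), ("Jack", 45, 9), ("Sue", 50, 3)]
play_id = 0
for player_id, (name, total, plays) in enumerate(standings, start=1):
    conn.execute("INSERT INTO player VALUES (?, ?)", (player_id, name))
    # First play carries most of the total; the remaining plays score 1 each.
    for points in [total - (plays - 1)] + [1] * (plays - 1):
        play_id += 1
        conn.execute("INSERT INTO play VALUES (?, 8)", (play_id,))
        conn.execute(
            "INSERT INTO score VALUES (?, ?, ?)", (play_id, player_id, points)
        )

rows = conn.execute("""
    SELECT player.name,
           SUM(score.score)     AS total_score,
           COUNT(score.play_id) AS plays,
           CASE WHEN COUNT(score.play_id) >= ? THEN 1 ELSE 0 END AS sufficient
    FROM player
    JOIN score ON score.player_id = player.id
    JOIN play  ON play.id = score.play_id
    WHERE play.game_id = 8
    GROUP BY player.id
    ORDER BY sufficient DESC, total_score DESC
""", (MIN_PLAYS,)).fetchall()

for row in rows:
    print(row)
```

Sue has a higher total than Jack but sorts last, because her three plays fall below the minimum and her `sufficient` flag is 0.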

Django: Filtering Annotations not working

I've been having issues filtering annotations. The models are:
class VenueBookmark(models.Model):
    venue = models.ForeignKey(Venue)
    cost_per_guest = models.DecimalField(max_digits=19, decimal_places=2, blank=True, null=True)
class Venue(DateAwareModel):
    name = models.CharField(max_length=255)
And in my view I basically have:
venues = venues.annotate(min_cost=Min('venuebookmark__cost_per_guest'))
venues = venues.filter(min_cost__lte=user_input)
However, I still get results which are > user_input. Any insights on this?
Edit:
I've tried converting user_input to Decimal type, but still get the same result:
venues = venues.filter(min_cost__lte=Decimal(user_input))
This is also most of the SQL I got from the Django Debug Toolbar:
SELECT "venue_search_venue"."name", MIN("bookmarks_venuebookmark"."cost_per_guest") AS "min_cost"
FROM "venue_search_venue"
LEFT OUTER JOIN "bookmarks_venuebookmark" ON ( "venue_search_venue"."id" = "bookmarks_venuebookmark"."venue_id" )
GROUP BY "venue_search_venue"."id", "venue_search_venue"."name"
HAVING MIN("bookmarks_venuebookmark"."cost_per_guest") <= %s
DESC LIMIT 9' - PARAMS = ("Decimal('100')",)
After querying the database directly and comparing the results, I suspect the issue is Django renders the user_input as a string instead of a number, e.g.
HAVING MIN("bookmarks_venuebookmark"."cost_per_guest") <= '100'
instead of:
HAVING MIN("bookmarks_venuebookmark"."cost_per_guest") <= 100
This is a funny question. The answer is simple: you should replace Min with Max:
venues = venues.annotate(max_cost=Max('venuebookmark__cost_per_guest'))
venues = venues.filter(max_cost__lte=user_input)
Illustrating it with an example:
Venue  VenueBookmark cost
1      5
1      7
With your current condition, if we take 6 as the maximum cost, venue 1 will appear in the result set because the minimum VenueBookmark cost is 5 and 5 < 6.
With Max instead of Min, venue 1 will not be included in the results, because 7 is not less than 6.
Edit
Also, include order_by() to clear the default ordering:
venues = (venues
    .order_by()
    .annotate(min_cost=Min('venuebookmark__cost_per_guest'))
)
Otherwise, the default ordering injects its sort fields into the GROUP BY clause.
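The Min versus Max difference from the example above can be reproduced directly in SQL. The following sketch uses Python's sqlite3 module with simplified table names (`venue`, `venuebookmark`) and a hypothetical helper `venues_having`; it is meant only to illustrate the HAVING behaviour, not to mirror the exact query Django generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE venue (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE venuebookmark (venue_id INTEGER, cost_per_guest NUMERIC);
    -- Venue 1 has two bookmarks, costing 5 and 7, as in the example.
    INSERT INTO venue VALUES (1, 'Venue 1');
    INSERT INTO venuebookmark VALUES (1, 5), (1, 7);
""")


def venues_having(aggregate, limit):
    """Return (venue id, aggregated cost) rows whose aggregate is <= limit."""
    # Only MIN and MAX are accepted, so interpolating the name is safe here.
    if aggregate not in ("MIN", "MAX"):
        raise ValueError("aggregate must be 'MIN' or 'MAX'")
    return conn.execute(
        f"SELECT venue.id, {aggregate}(b.cost_per_guest) "
        "FROM venue LEFT JOIN venuebookmark AS b ON b.venue_id = venue.id "
        f"GROUP BY venue.id HAVING {aggregate}(b.cost_per_guest) <= ?",
        (limit,),
    ).fetchall()


print("MIN <= 6:", venues_having("MIN", 6))
print("MAX <= 6:", venues_having("MAX", 6))
```

With a limit of 6, the MIN variant keeps venue 1 (its cheapest bookmark is 5), while the MAX variant returns no rows (its most expensive bookmark is 7).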

Django, accessing PostgreSQL sequence

In a Django application I need to create an order number which looks like: yyyymmddnnnn in which yyyy=year, mm=month, dd=day and nnnn is a number between 1 and 9999.
I thought I could use a PostgreSQL sequence since the generated numbers are atomic, so I can be sure when the process gets a number that number is unique.
So I created a PostgreSQL sequence:
CREATE SEQUENCE order_number_seq
INCREMENT 1
MINVALUE 1
MAXVALUE 9999
START 1
CACHE 1
CYCLE;
This sequence can be accessed as a table with a single row. So in the file checkout.py I created a Django model to access this sequence.
class OrderNumberSeq(models.Model):
    """
    This class maps to OrderNumberSeq which is a PostgreSQL sequence.
    This sequence runs from 1 to 9999 after which it restarts (cycles) at 1.
    A sequence is basically a special single row table.
    """
    sequence_name = models.CharField(max_length=128, primary_key=True)
    last_value = models.IntegerField()
    increment_by = models.IntegerField()
    max_value = models.IntegerField()
    min_value = models.IntegerField()
    cache_value = models.IntegerField()
    log_cnt = models.IntegerField()
    is_cycled = models.BooleanField()
    is_called = models.BooleanField()
    class Meta:
        db_table = u'order_number_seq'
I set the sequence_name as primary key as Django insists on having a primary key in a table.
Then I created a file get_order_number.py with the contents:
import datetime
def get_new_order_number():
    order_number = OrderNumberSeq.objects.raw("select sequence_name, nextval('order_number_seq') from order_number_seq")[0]
    today = datetime.date.today()
    year = u'%4s' % today.year
    month = u'%02i' % today.month
    day = u'%02i' % today.day
    new_number = u'%04i' % order_number.nextval
    return year+month+day+new_number
Now, when I call 'get_new_order_number()' from the Django interactive shell, it behaves as expected.
>>> checkout.order_number.get_new_order_number()
u'201007310047'
>>> checkout.order_number.get_new_order_number()
u'201007310048'
>>> checkout.order_number.get_new_order_number()
u'201007310049'
You see the numbers nicely incrementing by one every time the function is called. You can start multiple interactive django sessions and the numbers increment nicely with no identical numbers appearing in the different sessions.
Now I try to call this function from a view as follows:
import get_order_number
order_number = get_order_number.get_new_order_number()
and it gives me a number. However next time I access the view, it increments the number by 2. I have no idea where the problem is.
The best solution I can come up with is: don't worry if your order numbers are sparse. It should not matter if an order number is missing: there is no way to ensure that order numbers are contiguous that will not be subject to a race condition at some point.
Your biggest problem is likely to be convincing the pointy-haired ones that having 'missing' order numbers is not a problem.
For more details, see the Pseudokey Neat-Freak entry in SQL Antipatterns. (Note: this is a link to a book whose full text is not available for free.)
Take a look at this question/answer: Custom auto-increment field in postgresql (Invoice/Order No.)
You can create stored procedures using a RunSQL migration.
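Whichever way the sequence value is obtained, the yyyymmddnnnn formatting from the question can be kept in a small pure function that is easy to test on its own. The sketch below is one way to write it. The name `format_order_number` and the range check are assumptions for illustration, and it does not address why the view consumes two sequence values per request.

```python
import datetime


def format_order_number(day, sequence_value):
    """Build a yyyymmddnnnn order number from a date and a sequence value."""
    # The sequence in the question is declared with MINVALUE 1, MAXVALUE 9999.
    if not 1 <= sequence_value <= 9999:
        raise ValueError("sequence value must be between 1 and 9999")
    # Date as yyyymmdd followed by the zero-padded four-digit sequence value.
    return f"{day:%Y%m%d}{sequence_value:04d}"


print(format_order_number(datetime.date(2010, 7, 31), 47))
```

For 31 July 2010 and a sequence value of 47 this yields 201007310047, the same shape as the first number shown in the shell session above.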