Evaluating the same object through span relation in qs.exclude()

Consider a Hotel with Room objects and Reservation objects for those rooms. I want to find which rooms are available during a given period or (as in the example below) from a given date onward.
A reservation can be 'deleted', which is done by setting the "live" field. Such reservations are not actually deleted, just inactive, and it needs to stay that way.
>>> indate = "20141225"
>>> Room.objects.exclude(
...     models.Q(reservation__live=True, reservation__date_out__gt=indate)
...     | models.Q(reservation__date_out__isnull=True, reservation__live=True)
... )
Problem statement: The above code has an unfortunate side effect: when a room has one reservation with live=True outside the period and another reservation with live=False within the period, that Room is excluded. This should not be the case: since the reservation within the period I'm asking for is not set to live=True, it should not be taken into account.
It looks like my query above is not considering the same reservation when doing the (live=True and date_out__gt=indate) comparison through the room-reservation relation.
Question: is there a way, within the exclude(), to ensure the same reservation is considered in the comparison?
I tried playing around with negated Q objects (~models.Q) to no avail.
Note that the above is just an extract (where the issue resides) from a much larger query. Therefore I cannot simply do qs.filter(reservation__live=True). Moving the query over to filter() instead of exclude() doesn't seem to be an option.
Edit: adding my simplified models on request, below.
Edit2: adding some test data explaining the issue, and adding a chained exclude() as per @knbk's suggestion.
class AvailableRoomManager(models.Manager):
    def available_with_Q(self, indate, outdate):
        qs = super(AvailableRoomManager, self).get_queryset()
        if indate and outdate:
            qs = qs.exclude(models.Q(reservation__date_out__gt=indate, reservation__date_in__lt=outdate, reservation__live=True)
                            | models.Q(reservation__date_out__isnull=True, reservation__date_in__isnull=False, reservation__date_in__lt=outdate, reservation__live=True))
        elif indate and not outdate:
            qs = qs.exclude((models.Q(reservation__date_out__gt=indate, reservation__date_in__isnull=False, reservation__live=True)
                             | models.Q(reservation__date_out__isnull=True, reservation__date_in__isnull=False, reservation__live=True)))
        return qs

    def available_with_chained_excludes(self, indate, outdate):
        qs = super(AvailableRoomManager, self).get_queryset()
        if indate and outdate:
            qs = qs.exclude(reservation__date_out__gt=indate, reservation__date_in__lt=outdate, reservation__live=True) \
                   .exclude(reservation__date_out__isnull=True, reservation__date_in__isnull=False, reservation__date_in__lt=outdate, reservation__live=True)
        elif indate and not outdate:
            qs = qs.exclude(reservation__date_out__gt=indate, reservation__date_in__isnull=False, reservation__live=True) \
                   .exclude(reservation__date_out__isnull=True, reservation__date_in__isnull=False, reservation__live=True)
        return qs

class Room(models.Model):
    name = models.CharField(max_length=30, unique=True)
    objects = models.Manager()
    available_rooms = AvailableRoomManager()

    def __str__(self):
        return self.name

class Reservation(models.Model):
    date_in = models.DateField()
    date_out = models.DateField(blank=True, null=True)
    room = models.ForeignKey(Room)
    live = LiveField()  # See django-livefield; to do deletion. Basically adds a field "live" to the model.
    objects = LiveManager()
    all_objects = LiveManager(include_soft_deleted=True)
The problem pops up in the exclude() statements above when a room has an active reservation (live=True) outside the period I'm searching for and an inactive reservation (live != True) inside that period.
Some simple test data using the above models, showing what the issue is about:
# Let's make two rooms, R001 and R002
>>> room1 = Room.objects.get_or_create(name="R001")[0]
>>> room2 = Room.objects.get_or_create(name="R002")[0]
# First reservation, with no date_out, is created but then "deleted" by setting field 'live' to False
>>> res1 = Reservation.objects.get_or_create(date_in="2014-12-01", date_out=None, room=room1)[0]
>>> res1.live = False
>>> res1.save()
# Second reservation in same room is created with date_out set to Dec 15th
>>> res2 = Reservation.objects.get_or_create(date_in="2014-12-01", date_out="2014-12-15", room=room1)[0]
# Here I'd expect to have R001 listed as well... this is not the case
>>> Room.available_rooms.available_with_Q("2014-12-16", "")
[<Room: R002>]
>>> Room.available_rooms.available_with_chained_excludes("2014-12-16", "")
[<Room: R002>]
# As a test, when changing "deleted" res1's Room to room2, the manager does return R001
>>> res1.room = room2
>>> res1.save()
>>> Room.available_rooms.available_with_Q("2014-12-16", "")
[<Room: R001>, <Room: R002>]
>>> Room.available_rooms.available_with_chained_excludes("2014-12-16", "")
[<Room: R001>, <Room: R002>]

I tested it with your GitHub project, and I got the same results as you did. It seems that, while filter() correctly converts multiple related-object filters to an INNER JOIN, exclude() converts them into some sort of subquery where each filter (even within the same call) is checked on its own.
I've found a workaround for this, and that is to explicitly create the subquery of reservations:
elif indate and not outdate:
    ress = Reservation.objects.filter(Q(live=True, date_in__isnull=False), Q(date_out__gt=indate) | Q(date_out__isnull=True))
    rooms = Room.objects.exclude(reservation__in=ress)
etc...
Btw, if you ever need to use filter instead of exclude, the following queries are always identical, and in fact, this is what Django does internally:
Room.objects.exclude(<some_filter>)
Room.objects.filter(~Q(<some_filter>))
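To make the difference concrete, here is a rough sketch in plain SQL, run through the standard-library sqlite3 module, using the test data from the question. The SQL is hand-written for illustration and is not what Django actually emits; the table and column names are assumptions modeled on the question's models. The first query checks every condition in its own subquery (so different reservations can satisfy different conditions), the second requires one reservation row to satisfy all of them, which is what the reservation__in workaround expresses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE room (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE reservation (
        id INTEGER PRIMARY KEY,
        room_id INTEGER NOT NULL REFERENCES room(id),
        date_in TEXT NOT NULL,
        date_out TEXT,
        live INTEGER NOT NULL
    );
    INSERT INTO room (id, name) VALUES (1, 'R001'), (2, 'R002');
    -- res1: open-ended, but soft-deleted (live = 0)
    INSERT INTO reservation VALUES (1, 1, '2014-12-01', NULL, 0);
    -- res2: live, but already over on the search date
    INSERT INTO reservation VALUES (2, 1, '2014-12-01', '2014-12-15', 1);
""")

indate = "2014-12-16"  # ISO dates compare correctly as strings

# Each condition is tested in its own subquery, so res2 can satisfy
# "live" while res1 satisfies "date_out IS NULL".
PER_CONDITION = """
    SELECT name FROM room
    WHERE NOT (
        id IN (SELECT room_id FROM reservation WHERE live = 1)
        AND (
            id IN (SELECT room_id FROM reservation WHERE date_out > ?)
            OR id IN (SELECT room_id FROM reservation WHERE date_out IS NULL)
        )
    )
    ORDER BY name
"""

# All conditions must hold for the same reservation row.
SAME_ROW = """
    SELECT name FROM room
    WHERE id NOT IN (
        SELECT room_id FROM reservation
        WHERE live = 1
          AND date_in IS NOT NULL
          AND (date_out > ? OR date_out IS NULL)
    )
    ORDER BY name
"""

rooms_per_condition = [name for (name,) in conn.execute(PER_CONDITION, (indate,))]
rooms_same_row = [name for (name,) in conn.execute(SAME_ROW, (indate,))]
print(rooms_per_condition)
print(rooms_same_row)
```

With this data the first query drops R001 even though no single reservation is both live and open-ended, while the second keeps it.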

You want to not allow a checkin if there is already a reservation in that time, or a user would like to checkout when there is already a reservation placed:
first case : R_IN User_IN R_out
second case: R_IN User_IN (the room is considered busy, so no more check ins)
third case : User_IN R_IN User_OUT
Room.objects.exclude(
    models.Q(reservation__live=True, reservation__date_in__lte=indate, reservation__date_out__gte=indate)
    |
    models.Q(reservation__live=True, reservation__date_in__lte=indate, reservation__date_out__isnull=True)
    |
    models.Q(reservation__live=True, reservation__date_in__gte=indate, reservation__date_in__lte=outdate)
)
If you need to handle the case where outdate is not available, you could factor out the last case and handle it however suits you best: for example, by excluding the room altogether because of possible conflicts, or by forcing the user to leave the day before the existing reservation checks in.

Related

Django Rest Framework Filtering an object in get_queryset method

Basically, I have a catalog viewset. In the list view I want to apply some filtering and return the results accordingly.
Relevant Catalog model fields are:
class Catalog(models.Model):
    name = models.CharField(max_length=191, null=True, blank=False)
    ...
    team = models.ForeignKey(Team, on_delete=models.CASCADE, editable=False, related_name='catalogs')
    whitelist_users = models.JSONField(null=True, blank=True, default=list)  # If white list is null, it is open to whole team
Views.py
class CatalogViewSet(viewsets.ModelViewSet):
    permission_classes = (IsOwnerAdminOrRestricted,)

    def get_queryset(self):
        result = []
        user = self.request.user
        catalogs = Catalog.objects.filter(team__in=self.request.user.team_set.all())
        for catalog in catalogs:
            if catalog.whitelist_users == [] or catalog.whitelist_users == None:
                # catalog is open to whole team
                result.append(catalog)
            else:
                # catalog is private
                if user in catalog.whitelist_users:
                    result.append(catalog)
        return result
So this is my logic:
1 - Get the catalog object if the catalog's team is one of the current user's teams.
2 - Check if catalog.whitelist_users contains the current user. (There is also an exception: if it is None, the catalog is open to the whole team, so I can show it in the list view.)
Now this worked, but since I am returning a list, the detail view doesn't find objects correctly. I mean /catalog/ID doesn't work correctly.
I am new to DRF so I am guessing there is something wrong here. How would you implement this filtering better?

As the name of the method suggests, you need to return a queryset. Also, avoid iterating over a queryset if that's not necessary. It's better to do it in a single database hit. For complex queries, you can use the Q object.
from django.db.models import Q
# ...
def get_queryset(self):
    user = self.request.user
    catalogs = Catalog.objects.filter(
        Q(whitelist_users__in=[None, []]) | Q(whitelist_users__contains=user),
        team__in=user.team_set.all())
    return catalogs
Now I am not 100% sure the whitelist_users__contains=user will work since it depends on how you construct your JSON, but the idea is there, you will just need to adapt what it contains.
This will be much more efficient than looping in Python, and it respects what get_queryset is meant for.

A simple solution that comes to mind is just creating a list of PKs and filtering again, that way you return a Queryset. Not the most efficient solution, but should work:
def get_queryset(self):
    pks = []
    user = self.request.user
    catalogs = Catalog.objects.filter(team__in=user.team_set.all())
    for catalog in catalogs:
        if catalog.whitelist_users == [] or catalog.whitelist_users == None:
            # catalog is open to whole team
            pks.append(catalog.pk)
        else:
            # catalog is private
            if user in catalog.whitelist_users:
                pks.append(catalog.pk)
    return Catalog.objects.filter(id__in=pks)

Django - Q object search by data in an associated model

I have a Q object that queries my database that works something like this:
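import operator
from datetime import date
from functools import reduce

from django.db import models
from django.db.models import Q

class EventSearchManager(models.Manager):
    # The original snippet omitted the method definition; "search" and its
    # arguments are assumed names.
    def search(self, search_terms, timeselect):
        q_objects = []
        operators = {}  # date constraints shared by every term
        terms = [term.strip() for term in search_terms.split()]
        today = date.today()
        first_day = last_day = None
        if timeselect == "Today":
            first_day = today
            last_day = None
        for term in terms:
            search = (
                Q(name__icontains=term) |
                Q(tags__label__icontains=term),
            )
            if first_day is not None:
                operators.update({'start_date__gte': first_day})
            if last_day is not None:
                operators.update({'start_date__lte': last_day})
            q_objects.append(Q(*search, **operators))
        qs = self.get_queryset()
        return qs.filter(reduce(operator.or_, q_objects))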
It works well, but I've just refactored the Events so that start_date lives in a separate EventInstance model (this way an event can have any number of start dates).
Now I would like to adapt this search to return Event objects so that operators.update({'start_date__gte': first_day}) references the start_date of all associated EventInstance objects. Is there an easy syntax adjustment I can make, or will I need to reconstruct this process entirely? Or am I simply asking too much of the Q object?
This is my EventInstance model which establishes the relationship:
class EventInstance(models.Model):
    event = models.ForeignKey(Event)
    start = models.DateTimeField()
    duration = models.TimeField()
    recurring = models.CharField(max_length=2)

Q objects are exactly the same as a normal filter condition. Since you can follow relationships in a filter condition, you can do it in a Q as well.
Assuming the reverse relation just uses the default name eventinstance, and given that the date field on EventInstance is called start, you can do:
operators.update({'eventinstance__start__gte': first_day})

Django - preventing duplicate records

I have a list of client records in my database. Every year, we generate a single work order for each client. Then, for each work order record, the user should be able to create a note that is specific to the work order. However, not all work orders need a note, just some.
Now, I can't simply add a note field to the work order because sometimes we need to create the note before the work order is even generated. Sometimes this note is specific to a work order that won't happen for 2-3 years. Thus, the notes and the work order must be independent, although they will "find" each other when they both exist.
OK, so here's the situation. I want the user to be able to fill out a very simple note form, where they have two fields: noteYear and note. Thus, all they do is pick a year, and then write the note. The kicker is that the user should not be able to create two notes for the same year for the same client.
What I'm trying to get at is validating the note by ensuring that there isn't already a note for that year for that client. I'm assuming this would be achieved by a custom is_valid method within the form, but I can't figure out how to go about doing that.
This is what I tried so far (note that I know it's wrong, it doesn't work, but it's my attempt so far):
Note that systemID is my client record
My model:
class su_note(models.Model):
    YEAR_CHOICES = (
        ('2013', 2013),
        ('2014', 2014),
        ('2015', 2015),
        ('2016', 2016),
        ('2017', 2017),
        ('2018', 2018),
        ('2019', 2019),
        ('2020', 2020),
        ('2021', 2021),
        ('2022', 2022),
        ('2023', 2023),
    )
    noteYear = models.CharField(choices = YEAR_CHOICES, max_length = 4, verbose_name = 'Relevant Year')
    systemID = models.ForeignKey(System, verbose_name = 'System ID')
    note = models.TextField(verbose_name = "Note")

    def __unicode__(self):
        return u'%s | %s | %s' % (self.systemID.systemID, self.noteYear, self.noteType)
And my form:
class SU_Note_Form(ModelForm):
    class Meta:
        model = su_note
        fields = ('noteYear', 'noteType', 'note')

    def is_valid(self):
        valid = super (SU_Note_Form, self).is_valid()
        #If it is not valid, we're done -- send it back to the user to correct errors
        if not valid:
            return valid
        # now to check that there is only one record of SU for the system
        sysID = self.cleaned_data['systemID']
        sysID = sysID.systemID
        snotes = su_note.objects.filter(noteYear = self.cleaned_data['noteYear'])
        for s in snotes:
            if s.systemID == self.systemID:
                self._errors['Validation_Error'] = 'There is already a startup note for this year'
                return False
        return True
EDIT -- Here's my solution (thanks to janos for sending me in the right direction)
My final form looks like this:
class SU_Note_Form(ModelForm):
    class Meta:
        model = su_note
        fields = ('systemID', 'noteYear', 'noteType', 'note')

    def clean(self):
        cleaned_data = super(SU_Note_Form, self).clean()
        sysID = cleaned_data['systemID']
        sysID = sysID.systemID
        try:
            s = su_note.objects.get(noteYear = cleaned_data['noteYear'], systemID__systemID = sysID)
            print(s)
            self.errors['noteYear'] = "There is already a note for this year."
        except:
            pass
        return cleaned_data
For anyone else looking at this code, the only confusing part is the line that has: sysID = sysID.systemID. The systemID is actually a field of another model - even though systemID is also a field of this model -- poor design, probably, but it works.

See this page in the Django docs:
https://docs.djangoproject.com/en/dev/ref/forms/validation/
Since your validation logic depends on two fields (the year and the systemID), you need to implement this using a custom cleaning method on the form, for example:
def clean(self):
    cleaned_data = super(SU_Note_Form, self).clean()
    sysID = cleaned_data['systemID']
    sysID = sysID.systemID
    try:
        # Look for an existing note for the same year and the same client.
        su_note.objects.get(noteYear=cleaned_data['noteYear'], systemID__systemID=sysID)
        raise forms.ValidationError('There is already a startup note for this year')
    except su_note.DoesNotExist:
        pass
    # Always return the full collection of cleaned data.
    return cleaned_data
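As a complement to the form check, the same "one note per client per year" rule can also be enforced by the database itself. Below is a minimal sketch using the standard-library sqlite3 module; the table, the column names and the add_note helper are hypothetical stand-ins for the su_note model, not part of the original code. In Django, the analogous declaration would be unique_together = ('systemID', 'noteYear') in the model's Meta.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE su_note (
        id INTEGER PRIMARY KEY,
        system_id INTEGER NOT NULL,
        note_year TEXT NOT NULL,
        note TEXT NOT NULL,
        UNIQUE (system_id, note_year)  -- one note per client per year
    )
""")


def add_note(system_id, note_year, note):
    """Insert a note; return False if that client already has one for that year."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO su_note (system_id, note_year, note) VALUES (?, ?, ?)",
                (system_id, note_year, note),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = add_note(1, "2014", "Check the pump")
duplicate = add_note(1, "2014", "Second note, same client and year")
other_year = add_note(1, "2015", "A different year is fine")
other_client = add_note(2, "2014", "A different client is fine")
note_count = conn.execute("SELECT COUNT(*) FROM su_note").fetchone()[0]
print(first, duplicate, other_year, other_client, note_count)
```

The form-level clean() still gives the user a friendly error message; a constraint like this only guards against duplicates that slip past it.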

django - weird results (cached?) obtained while storing calculated values in fields at model level

Dear django gurus,
please grab my nose and stick it in where my silly mistake glows.
I wanted to perform a simple math operation based on existing field values and store the result in a separate field (subtotal) of the current instance.
My question is actually included in the very last comment of code.
class Element(models.Model):
    name = models.CharField(max_length=128)
    kids = models.ManyToManyField('self', null=True, blank=True, symmetrical=False)
    price = models.IntegerField('unit price', null=True, blank=True)
    amount = models.IntegerField('amount', null=True, blank=True)
    subtotal = models.IntegerField(null=True, blank=True)

    def counter(self):
        output = 0
        # Check if this is the lowest hierarchy level (where the intention is to
        # store both values for price and amount) and proceed calculations
        # based on input values. Also check if values are set to avoid operations
        # with incompatible types.
        # Else aggregate Sum() on subtotal values of all kids.
        if self.kids.count() == 0:
            if self.price == None or self.amount == None:
                output = 0
            else:
                output = self.price * self.amount
        else:
            output = self.kids.aggregate(Sum('subtotal'))['subtotal__sum']
        self.subtotal = output

    def __unicode__(self):
        return self.name
This is what my sample data looks like (I am sorry if I am missing some convention for how to show it).
element.name = son
element.kids = # None
element.price = 100
element.amount = 1
element.subtotal = 100 # Calculates and stores correct value.
element.name = daughter
element.kids = # None
element.price = 200
element.amount = 5
element.subtotal = 1000 # Calculates and stores correct value.
element.name = father
element.kids = son, daughter
element.price = # None. Use this field for overriding the calculated
# price at this level.
element.amount = # None.
element.subtotal = 25 # Calculates completely irrelevant value.
# The result seems to be some previous state
# (first maybe) of subtotal of one of the kids.
# This is where my cache part of question comes from.
While solving this simple task I started with the clean() method, but the results were only calculated after the second save (maybe a good topic for another question). I then switched to a custom method. But now, after a night spent on this, I would admittedly use Mr. Owl's words to Winnie the Pooh: "You sir are stuck!". At this point I am using only native Django admin forms. Any help will be most appreciated.

Without seeing the code you are using to invoke all these calculations, I can only guess what the problem might be. It may be helpful to show your view code (or other), which 'kicks off' the calculations of subtotal.
There are two potential issues I can see here.
The first is in your counter method. Are you saving the model instance after calculating the total?
def counter(self):
    output = 0
    if self.kids.count() == 0:
        if self.price == None or self.amount == None:
            output = 0
        else:
            output = self.price * self.amount
    else:
        output = self.kids.aggregate(Sum('subtotal'))['subtotal__sum']
    self.subtotal = output
    self.save()  # save the calculation
Without saving the instance, when you query the children from the parent, you will get the current database value rather than the newly calculated value. This may or may not be an issue, depending on the code that calls counter to begin with.
The other potential issue I see here is cache invalidation. If a child updates their price or amount, is the parent notified so it can also update its subtotal? You may be able to override the save method to do your calculations at the last minute.
def save(self, *args, **kwargs):
    self.counter()  # calculate subtotal
    super(Element, self).save(*args, **kwargs)  # Call the "real" save() method.
    for parent in self.element_set.all():  # use a related name instead
        parent.save()  # force recalculation of each parent
Note that this only brings subtotal up to date at save time. Be aware that overriding the save method this way directly conflicts with my first suggestion of saving the instance inside counter. If you use both together, you will get infinite recursion (a RecursionError in Python): counter is calculated and then saves, and during that save counter is calculated again, which triggers another save. Bad news.
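The propagation idea can be tried out without a database. The following is a minimal in-memory sketch, assuming a hypothetical plain-Python Element class (not the Django model): counter only computes and never saves, and save stores the result and then asks each parent to recompute, so the two never call each other in a loop.

```python
class Element:
    """In-memory stand-in for the Element model (hypothetical, no database)."""

    def __init__(self, name, price=None, amount=None):
        self.name = name
        self.price = price
        self.amount = amount
        self.kids = []
        self.parents = []
        self.subtotal = 0

    def add_kid(self, kid):
        self.kids.append(kid)
        kid.parents.append(self)

    def counter(self):
        """Compute the subtotal only; never calls save()."""
        if not self.kids:
            if self.price is None or self.amount is None:
                return 0
            return self.price * self.amount
        return sum(kid.subtotal for kid in self.kids)

    def save(self):
        """Store the fresh subtotal, then let each parent recompute."""
        self.subtotal = self.counter()
        for parent in self.parents:
            parent.save()


son = Element("son", price=100, amount=1)
daughter = Element("daughter", price=200, amount=5)
father = Element("father")
father.add_kid(son)
father.add_kid(daughter)

son.save()
daughter.save()
after_first_pass = father.subtotal

# Changing a child and saving it refreshes the parent as well.
daughter.amount = 6
daughter.save()
after_update = father.subtotal
print(after_first_pass, after_update)
```

This sketch assumes the hierarchy has no cycles; an element that is its own ancestor would make save() recurse forever.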

Reducing queries for manytomany models in django

EDIT:
It turns out the real question is - how do I get select_related to follow the m2m relationships I have defined? Those are the ones that are taxing my system. Any ideas?
I have two classes for my django app. The first (Item class) describes an item along with some functions that return information about the item. The second class (Itemlist class) takes a list of these items and then does some processing on them to return different values. The problem I'm having is that returning a list of items from Itemlist is taking a ton of queries, and I'm not sure where they're coming from.
class Item(models.Model):
    # for archiving purposes
    archive_id = models.IntegerField()
    users = models.ManyToManyField(User, through='User_item_rel',
                                   related_name='users_set')
    # for many to one relationship (tags)
    tag = models.ForeignKey(Tag)
    sub_tag = models.CharField(default='',max_length=40)
    name = models.CharField(max_length=40)
    purch_date = models.DateField(default=datetime.datetime.now())
    date_edited = models.DateTimeField(auto_now_add=True)
    price = models.DecimalField(max_digits=6, decimal_places=2)
    buyer = models.ManyToManyField(User, through='Buyer_item_rel',
                                   related_name='buyers_set')
    comments = models.CharField(default='',max_length=400)
    house_id = models.IntegerField()

    class Meta:
        ordering = ['-purch_date']

    def shortDisplayBuyers(self):
        if len(self.buyer_item_rel_set.all()) != 1:
            return "multiple buyers"
        else:
            return self.buyer_item_rel_set.all()[0].buyer.name

    def listBuyers(self):
        return self.buyer_item_rel_set.all()

    def listUsers(self):
        return self.user_item_rel_set.all()

    def tag_name(self):
        return self.tag

    def sub_tag_name(self):
        return self.sub_tag

    def __unicode__(self):
        return self.name
and the second class:
class Item_list:
    def __init__(self, list = None, house_id = None, user_id = None,
                 archive_id = None, houseMode = 0):
        self.list = list
        self.house_id = house_id
        self.uid = int(user_id)
        self.archive_id = archive_id
        self.gen_balancing_transactions()
        self.houseMode = houseMode

    def ret_list(self):
        return self.list
So after I construct Itemlist with a large list of items, Itemlist.ret_list() takes up to 800 queries for 25 items. What can I do to fix this?

Try using select_related
As per a question I asked here

Dan is right in telling you to use select_related.
select_related can be read about here.
What it does is return in the same query data for the main object in your queryset and the model or fields specified in the select_related clause.
So, instead of a query like:
select * from item
followed by several queries like this every time you access one of the item_list objects:
select * from item_list where item_id = <one of the items for the query above>
the ORM will generate a query like:
select a.*, b.*
from item a join item_list b
on a.id = b.item_id
In other words: it will hit the database once for all the data.

You probably want to use prefetch_related
It works similarly to select_related, but can handle relations that select_related cannot, such as many-to-many. The join happens in Python, but I've found it more efficient for this kind of work than a large number of queries.
Related reading on the subject
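To illustrate why a Python-side join can still beat one query per item, here is a small sketch with the standard-library sqlite3 module. It only mimics the strategy described above (one query for the items, one for all related rows, then grouping in Python) and is not Django's actual implementation; the tables, data and helper names are made up for the example.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE buyer_item_rel (
        id INTEGER PRIMARY KEY, item_id INTEGER, buyer TEXT
    );
    INSERT INTO item (id, name) VALUES (1, 'milk'), (2, 'bread'), (3, 'soap');
    INSERT INTO buyer_item_rel (item_id, buyer) VALUES
        (1, 'ann'), (1, 'bob'), (2, 'ann'), (3, 'cy');
""")

log = []  # every SQL statement issued through run()


def run(sql, params=()):
    log.append(sql)
    return conn.execute(sql, params).fetchall()


def buyers_one_by_one():
    """One query for the items, plus one extra query per item."""
    items = run("SELECT id, name FROM item ORDER BY id")
    result = {}
    for item_id, name in items:
        rows = run(
            "SELECT buyer FROM buyer_item_rel WHERE item_id = ? ORDER BY id",
            (item_id,),
        )
        result[name] = [buyer for (buyer,) in rows]
    return result


def buyers_prefetched():
    """Two queries in total; the grouping ("join") happens in Python."""
    items = run("SELECT id, name FROM item ORDER BY id")
    ids = [item_id for item_id, _ in items]
    placeholders = ", ".join("?" for _ in ids)
    rels = run(
        "SELECT item_id, buyer FROM buyer_item_rel "
        f"WHERE item_id IN ({placeholders}) ORDER BY id",
        ids,
    )
    by_item = defaultdict(list)
    for item_id, buyer in rels:
        by_item[item_id].append(buyer)
    return {name: by_item[item_id] for item_id, name in items}


log.clear()
slow = buyers_one_by_one()
slow_queries = len(log)

log.clear()
fast = buyers_prefetched()
fast_queries = len(log)
print(slow_queries, fast_queries, slow == fast)
```

Both functions return the same mapping, but the first grows by one query per item while the second stays at two queries regardless of how many items there are.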
