Django - Unexpected behaviour with default ordering after annotations

I've discovered a rather odd case: if I set the default ordering on a model to id or -id and then add distinct annotations to the query, the default ordering is ignored and the results are ordered by id ascending, regardless of what is set in the model's Meta.
However, when I choose a field other than id as the model's default ordering and do the same thing, the queryset is ordered correctly. It only seems to happen with id.
What gives? I'm not sure whether this is Django weirdness or Postgres weirdness. Is it because id is the primary key? If I call .order_by('-id') afterwards, it orders as desired. And if annotating breaks the ordering, why doesn't it always break?
Example
Django version: 4.1
Postgres version: 13.4
from django.db import models
from django.db.models import Count, Q
class OrderQuerySet(models.QuerySet):
    def annotateItemCounts(self):
        # One distinct count per order item state, plus an overall count
        return self.annotate(
            num_items=Count('order_items', distinct=True),
            num_items_draft=Count('order_items', distinct=True, filter=Q(order_items__state=OrderItem.OrderItemState.DRAFT)),
            num_items_back_order=Count('order_items', distinct=True, filter=Q(order_items__state=OrderItem.OrderItemState.BACK_ORDER)),
            num_items_confirmed=Count('order_items', distinct=True, filter=Q(order_items__state=OrderItem.OrderItemState.CONFIRMED)),
            num_items_in_progress=Count('order_items', distinct=True, filter=Q(order_items__state=OrderItem.OrderItemState.IN_PROGRESS)),
            num_items_ready=Count('order_items', distinct=True, filter=Q(order_items__state=OrderItem.OrderItemState.READY)),
            num_items_packed=Count('order_items', distinct=True, filter=Q(order_items__state=OrderItem.OrderItemState.PACKED)),
            num_items_shipped=Count('order_items', distinct=True, filter=Q(order_items__state=OrderItem.OrderItemState.SHIPPED)),
            num_items_completed=Count('order_items', distinct=True, filter=Q(order_items__state=OrderItem.OrderItemState.COMPLETE)),
        )
class OrderManager(models.Manager):
    def get_queryset(self):
        return OrderQuerySet(self.model, using=self._db)
    def annotateItemCounts(self):
        return self.get_queryset().annotateItemCounts()
class Order(models.Model):
    class Meta:
        ordering = ['-id']
    ...

Related

Django ORM Need help speeding up query, connected to additional tables

Running Django 1.6.5 (very old, I know, but I can't upgrade at the moment since this is production).
I'm working on a view where I need to perform a query and get data from a couple of other tables which have the same field (though on the other tables the ord_num key may exist multiple times; they are not foreign keys).
When I attempt to render this queryset in the view, it takes a very long time.
Any idea how I can speed this up?
Edit: The slowdown seems to come from the Pickconhdr lookup, but I can't speed it up. The Oracle DB itself doesn't have foreign keys on the Pickconhdr table, but I figured declaring one could speed things up in Django...
view queryset:
qs = Outordhdr.objects.filter(
    status__in=[10, 81],
    ti_type='#'
).exclude(
    ord_num__in=Shipclosewq.objects.values('ord_num')
).filter(
    ord_num__in=Pickconhdr.objects.values_list('ord_num', flat=True)
).order_by(
    'sch_shp_dt', 'wave_num', 'shp_dock_num'
)
Models file:
from django.db import models
class Outordhdr(models.Model):
    ord_num = models.CharField(max_length=13, primary_key=True)
    def get_conts_loaded(self):
        # Count and latest scan date of containers already loaded
        return self.pickcons.filter(cont_dvrt_flg__in=['C', 'R']).aggregate(
            conts_loaded=models.Count('ord_num'),
            last_conts_loaded=models.Max('cont_scan_dt')
        )
    @property
    def conts_left(self):
        return self.pickcons.exclude(cont_dvrt_flg__in=['C', 'R']).aggregate(
            conts_left=models.Count('ord_num')).values()[0]
    @property
    def last_conts_loaded(self):
        return self.get_conts_loaded().get('last_conts_loaded', 0)
    @property
    def conts_loaded(self):
        return self.get_conts_loaded().get('conts_loaded', 0)
    @property
    def tot_conts(self):
        return self.conts_loaded + self.conts_left
    @property
    def minutes_since_last_load(self):
        if self.last_conts_loaded:
            return round((get_db_current_datetime() - self.last_conts_loaded).total_seconds() / 60)
    class Meta:
        db_table = u'outordhdr'
class Pickconhdr(models.Model):
    ord_num = models.ForeignKey(Outordhdr, db_column='ord_num', max_length=13, related_name='pickcons')
    cont_num = models.CharField(max_length=20, primary_key=True)
    class Meta:
        db_table = u'pickconhdr'

From reading this query and looking at the documentation, it seems like the best way to optimise would be to add indexes on the non-unique fields. In this case I would recommend indexing:
ord_num, ti_type, status, sch_shp_dt, wave_num and shp_dock_num
Doing this will increase lookup speed for all of these fields, which should in turn allow the queryset to run faster.
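As a rough illustration of the indexing idea, here is a small sketch using Python's built-in sqlite3 module rather than the Oracle database from the question. The index name idx_outordhdr_type_status and the choice of a composite index on (ti_type, status) are my own assumptions; the point is only that the query plan switches to an index lookup once a matching index exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE outordhdr (
        ord_num TEXT PRIMARY KEY,
        status INTEGER,
        ti_type TEXT,
        sch_shp_dt TEXT
    );
    -- Hypothetical index name; the equality-tested column comes first.
    CREATE INDEX idx_outordhdr_type_status ON outordhdr (ti_type, status);
""")

# Ask SQLite how it would run the filter part of the question's query.
plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT ord_num FROM outordhdr
    WHERE ti_type = '#' AND status IN (10, 81)
""").fetchall()

# The human-readable plan text is the last column of each plan row.
details = [row[-1] for row in plan]
uses_index = any("idx_outordhdr_type_status" in d for d in details)
print(details)
print("index used:", uses_index)
```

Oracle has its own planner and its own tooling for inspecting plans, so treat this only as a way to see the principle on a toy table.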

You can probably try it like this, with isnull:
qs = Outordhdr.objects.filter(
    status__in=[10, 81],
    ti_type='#'
).filter(
    shipcons__ord_num__isnull=True,
    pickcons__ord_num__isnull=False
).order_by(
    'sch_shp_dt', 'wave_num', 'shp_dock_num'
)
I am assuming shipcons is the related name you have used for the relation between Outordhdr and Shipclosewq.
Here the isnull=False lookup keeps only Outordhdr rows that have at least one pickcons entry, and the isnull=True lookup keeps only rows that have no Shipclosewq entry (replacing the exclude).
FYI: You should consider upgrading your Django version along with your Python version, otherwise the server might be prone to security breaches.
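One thing to watch with the join-based version: an order with several pickcons rows can come back several times, so a .distinct() may be needed. The sketch below uses Python's sqlite3 module with hand-written SQL that is only roughly the shape of what the ORM would generate. The shipclosewq table name and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE outordhdr (ord_num TEXT PRIMARY KEY);
    CREATE TABLE pickconhdr (cont_num TEXT PRIMARY KEY, ord_num TEXT);
    CREATE TABLE shipclosewq (ord_num TEXT);  -- table name assumed
    INSERT INTO outordhdr VALUES ('A'), ('B'), ('C');
    -- Order A has two containers, B has one, C has none.
    INSERT INTO pickconhdr VALUES ('c1', 'A'), ('c2', 'A'), ('c3', 'B');
    -- Order B is already in the ship-close queue.
    INSERT INTO shipclosewq VALUES ('B');
""")

# Join-based form: must have a pickconhdr row, must not have a shipclosewq row.
join_sql = """
    SELECT o.ord_num
    FROM outordhdr o
    INNER JOIN pickconhdr p ON p.ord_num = o.ord_num
    LEFT JOIN shipclosewq s ON s.ord_num = o.ord_num
    WHERE s.ord_num IS NULL
    ORDER BY o.ord_num
"""
joined = [row[0] for row in conn.execute(join_sql)]

# EXISTS-based form: same conditions, but one result row per order.
exists_sql = """
    SELECT o.ord_num
    FROM outordhdr o
    WHERE EXISTS (SELECT 1 FROM pickconhdr p WHERE p.ord_num = o.ord_num)
      AND NOT EXISTS (SELECT 1 FROM shipclosewq s WHERE s.ord_num = o.ord_num)
    ORDER BY o.ord_num
"""
existing = [row[0] for row in conn.execute(exists_sql)]

print(joined)    # order A is repeated, once per container
print(existing)  # order A appears once
```

Whether the join or the subquery form is faster depends on the database and the data, so it is worth checking both against the real tables.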

Django QuerySet.annotate() received non-expression - how to add a derived field based on model field?

First time with Django. Trying to add an annotation to queryset:
class EnrollmentManager(models.Manager.from_queryset(EnrollmentCustomQuerySet)):
    COURSE_DURATION = datetime.timedelta(days=183)
    def get_queryset(self):
        """Overrides the models.Manager method"""
        lookback = make_aware(datetime.datetime.today() - self.COURSE_DURATION)
        qs = super(EnrollmentManager, self).get_queryset().annotate( \
            is_expired=(Value(True)), output_field=models.BooleanField())
        return qs
At the moment I am just trying to add an extra 'calculated' field on the returned queryset, which is hard-coded to True and the attribute/field should be called is_expired.
If I can get that to work, then Value(True) needs to be a derived value based on this expression:
F('enrolled') < lookback
But since 'enrolled' is a database field and lookback is calculated, how will I be able to do that?
Note
I tried this, which executes without throwing the error:
qs = super(EnrollmentManager, self).get_queryset().annotate( \
    is_expired=(Value(True, output_field=models.BooleanField())))
and in the shell I can see it:
Enrollment.objects.all()[0].is_expired -> returns True
and I can add it to the serializer:
class EnrollmentSerializer(serializers.ModelSerializer):
    is_active = serializers.SerializerMethodField()
    is_current = serializers.SerializerMethodField()
    is_expired = serializers.SerializerMethodField()
    COURSE_DURATION = datetime.timedelta(days=183)
    class Meta:
        model = Enrollment
        fields = ('id', 'is_active', 'is_current', 'is_expired')
    def get_is_expired(self, obj):
        return obj.is_expired
So it is possible... but how can I replace my hard-coded 'True' with a calculation?
UPDATE
Reading the documentation, it states:
"Annotates each object in the QuerySet with the provided list of query expressions. An expression may be a simple value, a reference to a field on the model (or any related models), or an aggregate expression (averages, sums, etc.) that has been computed over the objects that are related to the objects in the QuerySet."
A simple value - so, not a simple COMPUTED value then?
That makes me think this is not possible...

It seems like a pretty good use-case for a Case expression. I suggest getting as familiar as you can with these expression tools, they're very helpful!
I haven't tested this, but it should work. I'm assuming enrolled is a tz-aware datetime for when they first enrolled...
import datetime
from django.db import models
from django.db.models import Case, When, Value
from django.utils.timezone import make_aware
def get_queryset(self):
    """Overrides the models.Manager method"""
    lookback = make_aware(datetime.datetime.today() - self.COURSE_DURATION)
    qs = super(EnrollmentManager, self).get_queryset().annotate(
        is_expired=Case(
            # Expired when the enrollment date is older than the lookback cutoff
            When(
                enrolled__lt=lookback,
                then=Value(True)
            ),
            default=Value(False),
            output_field=models.BooleanField()
        )
    )
    return qs
You also don't have to pre-calculate the lookback variable. Check out ExpressionWrapper and this StackOverflow answer that addresses this.
ExpressionWrapper(
    TruncDate(F('date1')) + datetime.timedelta(days=365),
    output_field=DateField(),
)
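To make the Case/When idea concrete without a Django project, here is a sketch with Python's sqlite3 module showing the kind of SQL CASE expression involved: the Python-side lookback value is passed in as a query parameter and compared against the enrolled column. The enrollment table layout, the fixed "now" and the sample dates are assumptions for illustration only.

```python
import datetime
import sqlite3

COURSE_DURATION = datetime.timedelta(days=183)
# Fixed reference time so the sketch gives the same result on every run.
now = datetime.datetime(2024, 7, 1)
lookback = now - COURSE_DURATION

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrollment (id INTEGER PRIMARY KEY, enrolled TEXT)")
conn.executemany(
    "INSERT INTO enrollment (id, enrolled) VALUES (?, ?)",
    [(1, "2023-06-01 00:00:00"), (2, "2024-06-01 00:00:00")],
)

# The database field is compared with the computed Python value per row.
rows = conn.execute(
    """
    SELECT id,
           CASE WHEN enrolled < ? THEN 1 ELSE 0 END AS is_expired
    FROM enrollment
    ORDER BY id
    """,
    (lookback.isoformat(sep=" "),),
).fetchall()

print(rows)  # enrollment 1 is older than the cutoff, enrollment 2 is not
```

So a "computed" annotation is fine as long as the computation can be expressed as a database expression; the Python value simply becomes a parameter of that expression.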

Annotations in django with model managers

I have two models with a one-to-many relation.
One model is named Repairorder, which can have one or more instances of Work performed on that order.
What I need is to annotate the Repairorder queryset to sum the cumulative Work duration. On the Work model I annotated the duration of a single Work instance based on the start and end timestamps. Now I need to use this annotated field to sum the total cumulative Work performed for each order. I tried to extend the base model manager:
from django.db import models
from django.db.models import DurationField, ExpressionWrapper, F, Sum
from django.db.models.functions import Coalesce, Now
class WorkManager(models.Manager):
    def get_queryset(self):
        return super(WorkManager, self).get_queryset().annotate(duration=ExpressionWrapper(Coalesce(F('enddate'), Now()) - F('startdate'), output_field=DurationField()))
class Work(models.Model):
    #...
    order_idorder = models.ForeignKey('Repairorder', models.DO_NOTHING)
    startdate = models.DateTimeField()
    enddate = models.DateTimeField()
    objects = WorkManager()
class RepairorderManager(models.Manager):
    def get_queryset(self):
        return super(RepairorderManager, self).get_queryset().annotate(totalwork=Sum('work__duration'), output_field=DurationField())
class Repairorder(models.Model):
    #...
    idrepairorder = models.AutoField(primary_key=True)
    objects = RepairorderManager()
For each Repairorder I want to display the 'totalwork', however this error appears: QuerySet.annotate() received non-expression(s): . If I remove the output_field=DurationField() from the RepairorderManager, it says: Cannot resolve keyword 'duration' into field.
Doing it the 'Python way' by using model properties is not an option with big datasets.

You will need to add the calculation to the RepairorderManager as well:
class RepairorderManager(models.Manager):
    def get_queryset(self):
        return super(RepairorderManager, self).get_queryset().annotate(
            totalwork=ExpressionWrapper(
                Sum(Coalesce(F('work__enddate'), Now()) - F('work__startdate')),
                output_field=DurationField()
            )
        )
Django does not take into account annotations that a manager adds on related objects.
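The reason the calculation has to be repeated is easier to see in SQL terms: the per-row duration is not a stored column, so the aggregate has to be built from enddate and startdate directly. Here is a sketch with Python's sqlite3 module. The table layout, the integer "seconds" timestamps and the fixed value standing in for Now() are simplifying assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE repairorder (idrepairorder INTEGER PRIMARY KEY);
    CREATE TABLE work (
        id INTEGER PRIMARY KEY,
        order_idorder_id INTEGER REFERENCES repairorder (idrepairorder),
        startdate INTEGER,  -- seconds, simplified for this sketch
        enddate INTEGER     -- NULL while the work is still running
    );
    INSERT INTO repairorder VALUES (1), (2);
    INSERT INTO work (order_idorder_id, startdate, enddate) VALUES
        (1, 0, 100), (1, 200, NULL), (2, 10, 40);
""")

now = 260  # stand-in for Now(), fixed so the output is repeatable

# The duration expression lives inside SUM(); there is no "duration" column.
totals = conn.execute(
    """
    SELECT r.idrepairorder,
           SUM(COALESCE(w.enddate, ?) - w.startdate) AS totalwork
    FROM repairorder r
    LEFT JOIN work w ON w.order_idorder_id = r.idrepairorder
    GROUP BY r.idrepairorder
    ORDER BY r.idrepairorder
    """,
    (now,),
).fetchall()

print(totals)  # order 1: 100 + 60, order 2: 30
```

With real DateTimeField columns the subtraction yields a duration rather than an integer, which is why the ORM version needs output_field=DurationField().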

Ignore null values in descending order using Django Rest Framework

I am using Django for my website, and hence decided to use Django Rest Framework for building my REST APIs. For a particular model, I want to filter on a text field (using SearchFilter), filter on a few categorical fields (FilterBackend with a FilterSet defined) and be able to order data based on some fields (using OrderingFilter).
class StatsAPI(generics.ListAPIView):
    model = Stats
    queryset = Stats.objects.all()
    serializer_class = StatsSerializer
    filter_backends = (filters.DjangoFilterBackend, filters.OrderingFilter, filters.SearchFilter)
    filter_class = StatsFilter
    pagination_class = StatsPagination
    ordering_fields = '__all__'
    search_fields = ('display_name',)  # trailing comma makes this a tuple
The issue I am facing is with my ordering fields, as they also contain nulls. Ordering in ascending order works fine. However, ordering in descending order (www.example.com/api/stats/?ordering=-appearance) pushes the null values to the top.
How do I ignore the null values when using descending order? The number of fields on which ordering can be performed is roughly 20.

This is a slightly different solution: rather than filtering nulls out, this replacement for filters.OrderingFilter just always makes sure they sort last:
from django.db.models import F
from rest_framework import filters
class NullsAlwaysLastOrderingFilter(filters.OrderingFilter):
    """ Use Django 1.11 nulls_last feature to force nulls to bottom in all orderings. """
    def filter_queryset(self, request, queryset, view):
        ordering = self.get_ordering(request, queryset, view)
        if ordering:
            f_ordering = []
            for o in ordering:
                if not o:
                    continue
                # A leading '-' means descending order
                if o[0] == '-':
                    f_ordering.append(F(o[1:]).desc(nulls_last=True))
                else:
                    f_ordering.append(F(o).asc(nulls_last=True))
            return queryset.order_by(*f_ordering)
        return queryset
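To show what "nulls last in both directions" means for the results, here is a plain-Python sketch that mimics the behaviour on a list of dicts. It is not how Django or DRF implement it; the helper names parse_ordering and sort_nulls_last and the sample rows are made up for illustration.

```python
def parse_ordering(param):
    """Split an ordering parameter like '-appearance' into (field, descending)."""
    descending = param.startswith("-")
    return param.lstrip("-"), descending


def sort_nulls_last(rows, param):
    """Sort rows by one field, keeping rows whose value is None at the end."""
    field, descending = parse_ordering(param)
    present = [r for r in rows if r[field] is not None]
    missing = [r for r in rows if r[field] is None]
    present.sort(key=lambda r: r[field], reverse=descending)
    return present + missing


rows = [{"appearance": 3}, {"appearance": None}, {"appearance": 7}]

descending_values = [r["appearance"] for r in sort_nulls_last(rows, "-appearance")]
ascending_values = [r["appearance"] for r in sort_nulls_last(rows, "appearance")]

print(descending_values)  # [7, 3, None]
print(ascending_values)   # [3, 7, None]
```

In the real filter the database does this work via nulls_last=True, so nothing is loaded into Python just to be sorted.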

You can customize your own OrderingFilter:
# Created by BaiJiFeiLong@gmail.com at 2022/8/13
from django.db.models import F, OrderBy
from django_filters import rest_framework as filters
class MyOrderingFilter(filters.OrderingFilter):
    def get_ordering_value(self, param):
        value = super().get_ordering_value(param)
        return OrderBy(F(value.lstrip("-")), descending=value.startswith("-"), nulls_last=True)

To exclude the null values before ordering,
assuming your field name is stats, you can do the following:
Stats.objects.exclude(stats__isnull=True).exclude(stats__exact='')

BaiJiFeiLong's solution almost worked for me. With some tweaks, this ended up doing the trick:
from django.db.models import F, OrderBy
from rest_framework.filters import OrderingFilter
class NullsLastOrderingFilter(OrderingFilter):
    def get_ordering(self, request, queryset, view):
        values = super().get_ordering(request, queryset, view)
        return (OrderBy(F(value.lstrip("-")), descending=value.startswith("-"), nulls_last=True) for value in values)

Adding aggregate over filtered self-join field to Admin list_display

I would like to augment one of my model admins with an interesting value. Given a model like this:
class Participant(models.Model):
    pass
class Registration(models.Model):
    participant = models.ForeignKey(Participant)
    is_going = models.BooleanField(verbose_name='Is going')
Now, I would like to show the number of other Registrations for this Participant where is_going is False. So, something akin to this SQL query:
SELECT reg.*, COUNT(past.id) AS not_going_num
FROM registrations AS reg, registrations AS past
WHERE past.participant_id = reg.participant_id AND
past.is_going = False
I think I can extend the Admin's queryset() method according to Django Admin, Show Aggregate Values From Related Model, by annotating it with the extra Count, but I still cannot figure out how to work the self-join and filter into this.
I looked at Self join with django ORM and Django self join , How to convert this query to ORM query, but the former is doing SELECT *, and the latter seems to have data model problems.
Any suggestions on how to solve this?

See edit history for previous version of the answer.
The admin implementation below will display "Not Going Count" for each Registration model. The "Not Going Count" is the count of is_going=False for the registration's participant.
from django.contrib import admin
from django.db import models
from django.db.models import Case, Sum, When
@admin.register(Registration)
class RegistrationAdmin(admin.ModelAdmin):
    list_display = ['id', 'participant', 'is_going', 'ng_count']
    def ng_count(self, obj):
        # Value comes from the annotation added in get_queryset()
        return obj.not_going_count
    ng_count.short_description = 'Not Going Count'
    def get_queryset(self, request):
        qs = super(RegistrationAdmin, self).get_queryset(request)
        qs = qs.filter(participant__registration__isnull=False)
        qs = qs.annotate(not_going_count=Sum(
            Case(
                When(participant__registration__is_going=False, then=1),
                default=0,
                output_field=models.IntegerField())
        ))
        return qs
Below is a more thorough explanation of the QuerySet:
qs = qs.filter(participant__registration__isnull=False)
The filter causes Django to perform two joins - an INNER JOIN to participant table, and a LEFT OUTER JOIN to registration table.
qs = qs.annotate(not_going_count=Sum(
    Case(
        When(participant__registration__is_going=False, then=1),
        default=0,
        output_field=models.IntegerField())
))
This is a standard aggregate, which will be used to SUM up the count of is_going=False. This translates into the SQL
SUM(CASE WHEN past."is_going" = False THEN 1 ELSE 0 END)
The sum is generated for each registration model, and the sum belongs to the registration's participant.
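For anyone who wants to see the self-join and the conditional sum run, here is a sketch with Python's sqlite3 module. The SQL is hand-written to resemble what the annotation does, not copied from Django's output, and the table name and sample rows are assumptions. Note that the count includes the registration itself when it is also not going.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE registration (
        id INTEGER PRIMARY KEY,
        participant_id INTEGER,
        is_going INTEGER  -- 1 = going, 0 = not going
    );
    -- Participant 1 has three registrations, participant 2 has one.
    INSERT INTO registration (id, participant_id, is_going) VALUES
        (1, 1, 1), (2, 1, 0), (3, 1, 0), (4, 2, 1);
""")

# Join each registration to every registration of the same participant,
# then count the joined rows that are marked as not going.
rows = conn.execute("""
    SELECT reg.id,
           SUM(CASE WHEN past.is_going = 0 THEN 1 ELSE 0 END) AS not_going_count
    FROM registration reg
    INNER JOIN registration past ON past.participant_id = reg.participant_id
    GROUP BY reg.id
    ORDER BY reg.id
""").fetchall()

print(rows)  # every registration of participant 1 gets 2, participant 2 gets 0
```

If only the other registrations should be counted, the join condition would also need past.id <> reg.id.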

I might have misunderstood, but for a single participant you can do:
participant = Participant.objects.get(id=1)
not_going_count = Registration.objects.filter(participant=participant,
                                              is_going=False).count()
For all participants:
from django.db.models import Count
Registration.objects.filter(is_going=False).values('participant') \
    .annotate(not_going_num=Count('participant'))
See the Django docs on generating aggregates for each item in a queryset.
