Need help forming queryset with multiple aggregates

I have the following model defined:
class TestCaseResult(models.Model):
    run_result = models.ForeignKey(
        RunResult,
        on_delete=models.CASCADE,
    )
    name = models.CharField(max_length=128)
    duration = models.DurationField(default=datetime.timedelta)
    result = models.CharField(
        max_length=1,
        choices=(('f', 'failure'), ('s', 'skipped'), ('p', 'passed'), ('e', 'error')),
    )
I'm trying to get, in a single query, the count of each kind of result for a given run_result, along with the sum of the durations for the test cases with that result.
This gives me the count of each type of result, but I can't figure out how to get the sum of the durations included.
qs = TestCaseResult.objects.filter(run_result=run_result).values('result').annotate(result_count=Count('result'))
I basically want this as the resulting SQL:
SELECT
"api2_testcaseresult"."result",
SUM("api2_testcaseresult"."duration") AS "duration",
COUNT("api2_testcaseresult"."result") AS "result_count"
FROM "api2_testcaseresult"
WHERE "api2_testcaseresult"."run_result_id" = 3
GROUP BY "api2_testcaseresult"."result";
Note how 'duration' is not part of the 'group by' clause.

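You can simply append a second annotate():
qs = (
    TestCaseResult.objects
    .filter(run_result=run_result)
    .values('result')
    .annotate(result_count=Count('result'))
    .annotate(result_duration=Sum('duration'))
)
This should give you the desired SQL query, with the sum aliased as result_duration. The alias cannot be duration itself, because an annotation name must not clash with an existing model field.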
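To check what the grouped query computes, here is a minimal sketch that runs the same kind of GROUP BY against an in-memory SQLite database with Python's sqlite3 module. The table name, the sample rows, and storing durations as seconds in a REAL column are assumptions made for illustration; Django's actual table and DurationField storage will differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE testcaseresult ("
    "id INTEGER PRIMARY KEY, run_result_id INTEGER, result TEXT, duration REAL)"
)
sample_rows = [
    (3, "p", 1.5),
    (3, "p", 2.5),
    (3, "f", 4.0),
    (3, "s", 0.0),
    (7, "p", 9.0),  # belongs to another run, so the WHERE clause drops it
]
conn.executemany(
    "INSERT INTO testcaseresult (run_result_id, result, duration) VALUES (?, ?, ?)",
    sample_rows,
)

# One row per result kind: the count and the summed duration side by side.
summary = conn.execute(
    """
    SELECT result, COUNT(result) AS result_count, SUM(duration) AS result_duration
    FROM testcaseresult
    WHERE run_result_id = ?
    GROUP BY result
    ORDER BY result
    """,
    (3,),
).fetchall()
print(summary)
```

Both aggregates are computed over the same groups, which is why chaining two annotate() calls after values('result') is enough.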

Related

Annotate based on related field, with filters

class IncomeStream(models.Model):
    product = models.ForeignKey(Product, related_name="income_streams")
    from_date = models.DateTimeField(blank=True, null=True)
    to_date = models.DateTimeField(blank=True, null=True)
    value = MoneyField(max_digits=14, decimal_places=2, default_currency='USD')
class Product(models.Model):
    ...
class Sale(models.Model):
    product = models.ForeignKey(Product, related_name="sales")
    created_at = models.DateTimeField(auto_now_add=True)
    ...
With the above model, suppose I want to add a value to some Sales using .annotate.
This value is called cpa (cost per action): cpa is the value of the IncomeStream whose from_date and to_date include the Sale created_at in their range.
Furthermore, from_date and to_date are both nullable, in which case we assume they mean infinity.
For example:
<IncomeStream: from 2021-10-10 to NULL, value 10$, product TEST>
<IncomeStream: from NULL to 2021-10-09, value 5$, product TEST>
<Sale: product TEST, created_at 2019-01-01, [cpa should be 5$]>
<Sale: product TEST, created_at 2021-11-01, [cpa should be 10$]>
My question is: is it possible to write all these conditions using only the Django ORM and annotate? If yes, how?
I know F objects can traverse relationships like this:
Sale.objects.annotate(cpa=F('product__income_streams__value'))
But then where exactly can I write all the logic to determine which specific income_stream it should pick the value from?
Please assume that no income streams have overlapping dates for the same product, so the specs above never result in conflicts.

Something like this should get you started:
from django.db.models import OuterRef, Q, Subquery, Sum
subquery = (
    IncomeStream
    .objects
    .values('product')  # group by product primary key, i.e. product_id
    .filter(product=OuterRef('product'))
    # the stream starts on or before the sale; NULL means no lower bound
    .filter(Q(from_date__lte=OuterRef('created_at')) | Q(from_date__isnull=True))
    # the stream ends on or after the sale; NULL means no upper bound
    .filter(Q(to_date__gte=OuterRef('created_at')) | Q(to_date__isnull=True))
    .annotate(total_value=Sum('value'))
)
Then use the subquery in the annotation:
Sale.objects.annotate(
    cpa=Subquery(
        # the subquery should return only one row,
        # so only the total_value column is needed
        subquery.values('total_value')
    )
)
I haven't had the opportunity to try this in the shell myself, so I'm not 100% sure it works as written. It should be close, though.
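The matching rule the subquery has to express (a sale falls inside a stream's range, with a NULL bound treated as open-ended) can be sketched in plain Python. The Stream tuple and the cpa_for helper below are hypothetical names used only for illustration; they are not part of the models or of Django.

```python
from datetime import date
from typing import NamedTuple, Optional


class Stream(NamedTuple):
    # Hypothetical stand-in for one IncomeStream row of a single product.
    from_date: Optional[date]
    to_date: Optional[date]
    value: int


def cpa_for(created_at, streams):
    """Return the value of the stream whose range contains created_at."""
    for stream in streams:
        # A missing bound is treated as infinity on that side.
        starts_ok = stream.from_date is None or stream.from_date <= created_at
        ends_ok = stream.to_date is None or created_at <= stream.to_date
        if starts_ok and ends_ok:
            return stream.value
    return None  # no stream covers this date


streams = [
    Stream(date(2021, 10, 10), None, 10),
    Stream(None, date(2021, 10, 9), 5),
]
early_cpa = cpa_for(date(2019, 1, 1), streams)
late_cpa = cpa_for(date(2021, 11, 1), streams)
print(early_cpa, late_cpa)
```

The two sample streams and sales mirror the example in the question, so the lower bound is compared with "less than or equal" and the upper bound with "greater than or equal".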

Filtering X most recent entries in each category of queryset

The question is about keeping the X most recent entries in each category of a queryset.
The goal is as follows:
I have an incoming queryset based on the following model.
class UserStatusChoices(models.TextChoices):
    CREATOR = 'CREATOR'
    SLAVE = 'SLAVE'
    MASTER = 'MASTER'
    FRIEND = 'FRIEND'
    ADMIN = 'ADMIN'
    LEGACY = 'LEGACY'
class OperationTypeChoices(models.TextChoices):
    CREATE = 'CREATE'
    UPDATE = 'UPDATE'
    DELETE = 'DELETE'
class EntriesChangeLog(models.Model):
    content_type = models.ForeignKey(
        ContentType,
        on_delete=models.CASCADE,
    )
    object_id = models.PositiveIntegerField()
    content_object = GenericForeignKey(
        'content_type',
        'object_id',
    )
    user = models.ForeignKey(
        get_user_model(),
        verbose_name='user',
        on_delete=models.SET_NULL,
        null=True,
        blank=True,
        related_name='access_logs',
    )
    access_time = models.DateTimeField(
        verbose_name='access_time',
        auto_now_add=True,
    )
    as_who = models.CharField(
        verbose_name='Status of the accessed user.',
        choices=UserStatusChoices.choices,
        max_length=7,
    )
    operation_type = models.CharField(
        verbose_name='Type of the access operation.',
        choices=OperationTypeChoices.choices,
        max_length=6,
    )
I need to filter this incoming queryset so that it keeps only the 4 most recent objects (by the access_time field) in each category. Categories are defined by the 'content_type_id' field, and there are 3 possible options.
Let's call them 'option1', 'option2' and 'option3'.
The incoming queryset might contain a different number of objects from 1, 2 or all 3 categories. This can't be predicted beforehand.
DISTINCT can't be used, because the queryset might be ordered after the filtering operation.
I managed to get the single most recent object in the following way:
# get the most recent operation in each category
last_operation_time = Subquery(
    EntriesChangeLog.objects
    .filter(user=OuterRef('user'))
    .values('content_type_id')
    .annotate(last_access_time=Max('access_time'))
    .values_list('last_access_time', flat=True)
)
queryset.filter(access_time__in=last_operation_time)
But I'm having a hard time figuring out how to get the 4 most recent objects instead of just the last one.
This is needed for Django-Filter and needs to be done in one query.
The database is Postgres 12.
Do you have any ideas on how to do such filtering?
Thanks...

from django.db.models import F, Window
from django.db.models.functions import DenseRank
# Body of a filter method; `value` is the number of entries to keep per category.
pk_to_rank = queryset.annotate(rank=Window(
    expression=DenseRank(),
    partition_by=('content_type_id',),
    order_by=F('access_time').desc(),
)).values_list('pk', 'rank', named=True)
pks_list = sorted(log.pk for log in pk_to_rank if log.rank <= value)
return queryset.filter(pk__in=pks_list)
I only managed to do it this way, by splitting the work into 2 queries. An option with 3 unions is also possible, but what if we have 800 options instead of 3? Make 800 union() calls? I guess not...
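What the ranking step selects can be sketched in plain Python over dictionaries. The top_n_per_category helper and the integer timestamps are assumptions for illustration only; the point is how a dense rank partitioned by content_type_id decides which primary keys survive.

```python
from collections import defaultdict


def top_n_per_category(rows, n):
    """Return sorted pks of rows whose dense rank by access_time is <= n."""
    by_category = defaultdict(list)
    for row in rows:
        by_category[row["content_type_id"]].append(row)

    kept = []
    for category_rows in by_category.values():
        # Dense rank: equal timestamps share a rank and ranks have no gaps,
        # so the n best ranks are simply the n newest distinct timestamps.
        distinct_times = sorted(
            {row["access_time"] for row in category_rows}, reverse=True
        )
        allowed = set(distinct_times[:n])
        kept.extend(
            row["pk"] for row in category_rows if row["access_time"] in allowed
        )
    return sorted(kept)


rows = [
    {"pk": 1, "content_type_id": 1, "access_time": 10},
    {"pk": 2, "content_type_id": 1, "access_time": 20},
    {"pk": 3, "content_type_id": 1, "access_time": 30},
    {"pk": 4, "content_type_id": 2, "access_time": 5},
    {"pk": 5, "content_type_id": 2, "access_time": 5},
    {"pk": 6, "content_type_id": 2, "access_time": 1},
]
newest_two = top_n_per_category(rows, 2)
newest_one = top_n_per_category(rows, 1)
print(newest_two, newest_one)
```

Because tied timestamps share a dense rank, a category can yield more than n rows, as category 2 does here. If exactly n rows per category are required, RowNumber() from django.db.models.functions is the usual alternative to DenseRank().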

Annotation to count and return zero when there is no relation

Given the following relation:
class LicenseRequest:
    license_type = models.ForeignKey(LicenseType)
    created_at = models.DateField(default=now, editable=False)
class LicenseType:
    name = models.CharField(max_length=100)
    value = models.CharField(max_length=3, unique=True)
I want to count how many requests have been created for each license type. However, since I am generating a graphic, I must include 0 (zero) for license types without any license request in that specific period.
I tried to do what was suggested here, but it did not work. I can only get the count from License Types which have more than one license request.
qs = LicenseType.objects.filter(
    Q(licenserequest__created_at__range=(start_date, end_date)) | Q(licenserequest__isnull=True)
).annotate(rel_count=Count('licenserequest__id'))
I could find another way to achieve this goal, but I was wondering if I can do it through annotation.
I am using Django 1.11.15.

In django-2.0 and higher, the Count object has a filter parameter, so we can specify the conditions for this:
qs = LicenseType.objects.annotate(
    rel_count=Count(
        'licenserequest',
        filter=Q(licenserequest__created_at__range=(start_date, end_date))
    )
)
For django-1.11 and below, we can use the Sum(..) of a Case(..) expression:
qs = LicenseType.objects.annotate(
    rel_count=Sum(Case(
        When(
            licenserequest__created_at__range=(start_date, end_date),
            then=1
        ),
        default=0,
        output_field=IntegerField()
    ))
)
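The difference between filtering rows and counting conditionally can be reproduced with plain SQL. The following is a minimal sketch using Python's sqlite3 module; the table names, the three license types, and the dates are made up for illustration, and the SQL is hand-written rather than what Django generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE licensetype (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE licenserequest (
        id INTEGER PRIMARY KEY,
        license_type_id INTEGER REFERENCES licensetype(id),
        created_at TEXT
    );
    INSERT INTO licensetype (id, name) VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO licenserequest (license_type_id, created_at) VALUES
        (1, '2018-08-05'),
        (1, '2018-08-20'),
        (2, '2018-01-15');
    """
)
period = ("2018-08-01", "2018-08-31")

# Filtering in WHERE: type B has only an out-of-range request, so it vanishes.
filtered = conn.execute(
    """
    SELECT t.name, COUNT(r.id)
    FROM licensetype t
    LEFT OUTER JOIN licenserequest r ON r.license_type_id = t.id
    WHERE r.created_at BETWEEN ? AND ? OR r.id IS NULL
    GROUP BY t.id, t.name
    ORDER BY t.name
    """,
    period,
).fetchall()

# Conditional aggregate: every type keeps its row and unmatched rows add 0.
conditional = conn.execute(
    """
    SELECT t.name,
           SUM(CASE WHEN r.created_at BETWEEN ? AND ? THEN 1 ELSE 0 END)
    FROM licensetype t
    LEFT OUTER JOIN licenserequest r ON r.license_type_id = t.id
    GROUP BY t.id, t.name
    ORDER BY t.name
    """,
    period,
).fetchall()
print(filtered, conditional)
```

Moving the date condition out of the row filter and into the aggregate is what keeps license types with no matching requests in the result.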

qs = LicenseType.objects.annotate(count=Count('licenserequest__id'))
condition = Q(licenserequest__created_at__range=(start_date, end_date)) & Q(licenserequest__isnull=True)
qs = qs.annotate(rel_count=Case(When(condition, then=F('count')), default=0, output_field=IntegerField()))
This should work for the model description that you have provided.
For the latter filter, you cannot use .filter() directly; use a Case/When expression inside .annotate() instead.

Django combine filter on two fields

I am relatively new to Django. I'm having a problem filtering data. I have two models, given below:
class Account(models.Model):
    name = models.CharField(max_length=60)
    hotel = models.ForeignKey(Hotel)
    account_type = models.CharField(choices=ACCOUNT_TYPE, max_length=30)
class Transaction(models.Model):
    account = models.ForeignKey(Account, related_name='transaction')
    transaction_type = models.CharField(choices=TRANSACTION_TYPE, max_length=15)
    description = models.CharField(max_length=100, blank=True, null=True)
    date_created = models.DateTimeField(default=timezone.now)
ACCOUNT_TYPE is:
ACCOUNT_TYPE = (
    (0, 'Asset'),
    (1, 'Liabilities'),
    (2, 'Equity'),
    (3, 'Income'),
    (4, 'Expense')
)
I want to get all the transactions whose account type is Income or Expense within a given date range. How can I combine those filters in Django?
I have tried like this:
income_account = Account.objects.filter(account_type=3)
expense_account = Account.objects.filter(account_type=4)
transactions = Transaction.objects.filter(
    Q(
        account=income_account,
        date_created__gte=request.data['start_date'],
        date_created__lte=request.data['end_date']
    ) & Q(
        account=expense_account,
        date_created__gte=request.data['start_date'],
        date_created__lte=request.data['end_date']
    )
).order_by('date_created')
But it's not working. It raises the following error:
ProgrammingError: more than one row returned by a subquery used as an expression

income_account and expense_account are not single objects; each is a queryset that may contain several accounts. So instead of account=income_account and account=expense_account, use the in lookup: account__in=income_account and account__in=expense_account. A transaction belongs to only one account, so the two conditions also need to be combined with | (OR) rather than & (AND).
You could probably also simplify the queryset like this:
accounts = Account.objects.filter(Q(account_type=3) | Q(account_type=4))
transactions = Transaction.objects.filter(
    account__in=accounts,
    date_created__gte=request.data['start_date'],
    date_created__lte=request.data['end_date']
).order_by('date_created')

Instead of having multiple querysets, you can have only one, as Q allows ORing of filters. You could do:
Transaction.objects.filter(
    (Q(account__account_type=3) | Q(account__account_type=4)) &
    Q(date_created__range=[start_date, end_date])
)
The __range can be used to get dates between the specified start_date and end_date.
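The combined condition (account type is Income or Expense, and the date lies inside an inclusive range) corresponds to a join with IN and BETWEEN. Here is a minimal sketch with Python's sqlite3 module; the txn table name, the account names, and the sample dates are invented for illustration, and the SQL is hand-written rather than Django's output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE account (id INTEGER PRIMARY KEY, name TEXT, account_type INTEGER);
    CREATE TABLE txn (
        id INTEGER PRIMARY KEY,
        account_id INTEGER REFERENCES account(id),
        date_created TEXT
    );
    INSERT INTO account (id, name, account_type) VALUES
        (1, 'Cash', 0), (2, 'Room sales', 3), (3, 'Laundry', 4);
    INSERT INTO txn (id, account_id, date_created) VALUES
        (1, 1, '2020-01-10'),
        (2, 2, '2020-01-01'),
        (3, 3, '2020-01-31'),
        (4, 2, '2020-02-01');
    """
)

# account_type IN (3, 4) plays the role of the two OR-ed Q objects,
# and BETWEEN includes both end points, like the range lookup.
matching_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT txn.id
        FROM txn
        JOIN account ON account.id = txn.account_id
        WHERE account.account_type IN (3, 4)
          AND txn.date_created BETWEEN ? AND ?
        ORDER BY txn.date_created
        """,
        ("2020-01-01", "2020-01-31"),
    )
]
print(matching_ids)
```

Transaction 1 is dropped because its account is an Asset, and transaction 4 is dropped because it falls after the end date; the two rows that sit exactly on the bounds are kept.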

You can always use the in lookup to match records against multiple values. So, if you want the transactions whose account type is Income or Expense, you can use it like this:
Transaction.objects.filter(
    Q(account__account_type__in=[3, 4])
    & Q(date_created__gte=request.data['start_date'])
    & Q(date_created__lte=request.data['end_date'])
).order_by('date_created')

This will work for you:
result = Account.objects.filter(account_type__in=['Income', 'Expense'])
OR
result = Account.objects.filter(account_type__in=['3', '4'])
I have put 3 and 4 as strings because you have declared account_type as a CharField.

Django query with annotated conditional expression uses INNER JOIN. How do I get it to use OUTER JOIN?

I have a "Meal" model with a foreign key to "Food". Each meal has a rating: good, bad, or indifferent. I want to query a list of all foods and annotate the count of each type of meal rating, but some foods have no meals yet, so I want the query to use a LEFT OUTER JOIN and in that case the counts should be zero.
I am using Conditional Expressions in Django 1.8, and it always switches the relationship to an INNER JOIN between "Food" and "Meal". For example:
Meal model:
class Meal(models.Model):
    GOOD = 1
    BAD = 2
    INDIFFERENT = 3
    RATING_CHOICES = (
        (GOOD, 'Good'),
        (BAD, 'Bad'),
        (INDIFFERENT, 'Indifferent')
    )
    meal_time = models.DateTimeField()
    food = models.ForeignKey("Food")
    rating = models.IntegerField(blank=True, null=True, choices=RATING_CHOICES)
When I query Food.objects.annotate(total_meals=Count('meal')), Django generates a query like
SELECT ... FROM "Food"
LEFT OUTER JOIN "Meal" ON ...
GROUP BY "Food"
However, when I add these conditional annotations:
class FoodQuerySet(models.QuerySet):
    def with_meal_rating_frequency(self):
        return self.annotate(
            total_meals=Count('meal'),
            good_meals=Sum(
                Case(When(meal__rating=Meal.GOOD, then=1),
                     output_field=models.IntegerField(), default=0)
            ),
            bad_meals=Sum(
                Case(When(meal__rating=Meal.BAD, then=1),
                     output_field=models.IntegerField(), default=0)
            ),
            indifferent_meals=Sum(
                Case(When(meal__rating=Meal.INDIFFERENT, then=1),
                     output_field=models.IntegerField(), default=0)
            )
        )
Django uses an INNER JOIN instead.
SELECT ... FROM "Food"
INNER JOIN "Meal" ON ...
GROUP BY "Food"
I know this question is very similar to this one, but it's not clear to me how to apply the accepted solution to my case. How can I get Django to use a LEFT OUTER JOIN? Your help is appreciated, thanks!

I have found a solution that seems to be working so far, using Count() instead of Sum() and having the conditions check for NULL meals, which won't be included in the count:
class FoodQuerySet(models.QuerySet):
    def with_meal_rating_frequency(self):
        return self.annotate(
            total_meals=Count('meal'),
            good_meals=Count(
                Case(When(Q(meal__isnull=True) | Q(meal__rating=Meal.GOOD), then='meal__rating'),
                     output_field=models.IntegerField(), default=None)
            ),
            bad_meals=Count(
                Case(When(Q(meal__isnull=True) | Q(meal__rating=Meal.BAD), then='meal__rating'),
                     output_field=models.IntegerField(), default=None)
            ),
            indifferent_meals=Count(
                Case(When(Q(meal__isnull=True) | Q(meal__rating=Meal.INDIFFERENT), then='meal__rating'),
                     output_field=models.IntegerField(), default=None)
            )
        )
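This approach relies on COUNT skipping NULL values: a food with no meals still gets one row from the LEFT OUTER JOIN, but every meal column in that row is NULL, so it contributes nothing to the counts. A minimal sketch with Python's sqlite3 module shows the effect; the table names and sample data are assumptions, and the SQL is hand-written rather than what Django emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE food (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE meal (
        id INTEGER PRIMARY KEY,
        food_id INTEGER REFERENCES food(id),
        rating INTEGER
    );
    INSERT INTO food (id, name) VALUES (1, 'pasta'), (2, 'soup');
    INSERT INTO meal (food_id, rating) VALUES (1, 1), (1, 1), (1, 2);
    """
)

# CASE without ELSE yields NULL for non-matching rows, and COUNT ignores NULLs.
frequencies = conn.execute(
    """
    SELECT food.name,
           COUNT(meal.id) AS total_meals,
           COUNT(CASE WHEN meal.rating = 1 THEN meal.rating END) AS good_meals,
           COUNT(CASE WHEN meal.rating = 2 THEN meal.rating END) AS bad_meals
    FROM food
    LEFT OUTER JOIN meal ON meal.food_id = food.id
    GROUP BY food.id, food.name
    ORDER BY food.name
    """
).fetchall()
print(frequencies)
```

The food without meals ('soup' in this made-up data) stays in the result with all counts at zero, which is the behaviour the question asks for.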
