I have two tables:
Ticket Table
id paid_with_tax location
1 5 A
2 6 B
3 7 B
TicketAdjustment Table
id ticket_id value_with_tax
1 1 2
2 1 1
3 1 2
4 1 3
5 2 5
The query I use:
from django.db.models import Count, F, Sum
from django.db.models.functions import Coalesce
Ticket.objects.all().annotate(
    paid_amount=Sum(
        F('paid_with_tax') +
        Coalesce(F('ticketadjustment__value_with_tax'), 0)
    )
).values(
    'paid_amount', 'location'
).annotate(
    Count('id')
)
The query returns the following:
[
{
id__count: 6,
paid_amount__sum: 28,
location: A
},
{
id__count: 2,
paid_amount__sum: 18,
location: B
},
]
But the above is incorrect, because the Ticket row with id=1 is duplicated once for each of its TicketAdjustment rows.
How can I get the query to sum the TicketAdjustment values per ticket before adding them to paid_with_tax?
Some constraints:
- The order of the calls would ideally stay the same, as I have a function that returns the queryset to be filtered further.
The final result should look as follows:
[
{
id__count: 1,
paid_amount__sum: 13,
location: A
},
{
id__count: 2,
paid_amount__sum: 18,
location: B
},
]
models.py:
class Ticket(models.Model):
    paid_with_tax = models.DecimalField(max_digits=6, decimal_places=4)
    location = models.ForeignKey(Location)
class TicketAdjustment(models.Model):
    value_with_tax = models.DecimalField(max_digits=6, decimal_places=4)
    ticket = models.ForeignKey(Ticket)
As I understand it, this aggregation cannot be done with a plain join, even in a raw SQL query, because joining Ticket with TicketAdjustment repeats each ticket row once for every adjustment it has (ticket 1 appears four times).
So we can't just add every value_with_tax related to a ticket to its paid_with_tax and group the result by location.
I couldn't find a solution that performs this in one SQL query, but I did find a way to do it in two queries:
tickets = Ticket.objects.values(
    'location',
).annotate(
    count=Count('id', distinct=True),
    paid_amount=Sum('paid_with_tax')
)
# <QuerySet [
# {'paid_amount': 5, 'count': 1, 'location': 'A'},
# {'paid_amount': 13, 'count': 2, 'location': 'B'}
# ]>
adjustments = TicketAdjustment.objects.annotate(
    location=F('ticket__location'),
).values(
    'location',
).annotate(
    paid_amount=Sum('value_with_tax')
)
# <QuerySet [
# {'paid_amount': 5, 'location': 'B'},
# {'paid_amount': 8, 'location': 'A'}
# ]>
def find_paid_amount_for_list_and_location(l, location):
    for item in l:
        if item['location'] == location:
            return item['paid_amount'] or 0
    return 0  # no adjustments recorded for this location
for obj in tickets:
    paid = find_paid_amount_for_list_and_location(adjustments, obj['location'])
    obj['paid_amount'] += paid
tickets
# [
# {'location': 'A', 'paid_amount': 13, 'count': 1},
# {'location': 'B', 'paid_amount': 18, 'count': 2}
# ]
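The merge step above scans the adjustments list once per ticket row. A minimal sketch of the same merge using a dictionary lookup follows; it works on plain dicts copied from the sample output above, and the names adjustment_by_location and merged are illustrative, not part of the original answer.

```python
# Sample results copied from the two querysets above, as plain dicts.
tickets = [
    {'location': 'A', 'paid_amount': 5, 'count': 1},
    {'location': 'B', 'paid_amount': 13, 'count': 2},
]
adjustments = [
    {'location': 'B', 'paid_amount': 5},
    {'location': 'A', 'paid_amount': 8},
]

# Index the adjustment totals by location once, instead of scanning per ticket row.
adjustment_by_location = {
    row['location']: row['paid_amount'] or 0 for row in adjustments
}

# Locations without any adjustment fall back to 0.
merged = [
    {
        **row,
        'paid_amount': row['paid_amount']
        + adjustment_by_location.get(row['location'], 0),
    }
    for row in tickets
]

print(merged)
```

With the sample data this yields 13 for location A and 18 for location B, the same totals as the loop above.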
But this is not a very efficient solution. I think you should create a new table for locations and just have an FK to that table. In that case you will be able to perform these calculations in one query on the DB side.
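For reference, the duplication and one possible database-side workaround can be reproduced with the standard-library sqlite3 module. This is only a sketch under assumptions: the table names ticket and adjustment are illustrative, values are stored as integers, and the second query (summing adjustments per ticket in a derived table before joining) is an alternative technique, not the approach used in the answer above.

```python
import sqlite3

# In-memory copy of the two tables from the question (table names are illustrative).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ticket (id INTEGER PRIMARY KEY, paid_with_tax INTEGER, location TEXT);
    CREATE TABLE adjustment (id INTEGER PRIMARY KEY, ticket_id INTEGER, value_with_tax INTEGER);
    INSERT INTO ticket VALUES (1, 5, 'A'), (2, 6, 'B'), (3, 7, 'B');
    INSERT INTO adjustment VALUES (1, 1, 2), (2, 1, 1), (3, 1, 2), (4, 1, 3), (5, 2, 5);
""")

# Plain join: ticket 1 is repeated once per adjustment, so its 5 is added four times.
naive = conn.execute("""
    SELECT t.location, SUM(t.paid_with_tax + COALESCE(a.value_with_tax, 0))
    FROM ticket t
    LEFT JOIN adjustment a ON a.ticket_id = t.id
    GROUP BY t.location
    ORDER BY t.location
""").fetchall()

# Pre-aggregated join: adjustments are summed per ticket first, so each ticket
# contributes exactly one row to the outer GROUP BY.
fixed = conn.execute("""
    SELECT t.location, COUNT(*), SUM(t.paid_with_tax + COALESCE(a.total, 0))
    FROM ticket t
    LEFT JOIN (
        SELECT ticket_id, SUM(value_with_tax) AS total
        FROM adjustment
        GROUP BY ticket_id
    ) a ON a.ticket_id = t.id
    GROUP BY t.location
    ORDER BY t.location
""").fetchall()

print(naive)  # [('A', 28), ('B', 18)]
print(fixed)  # [('A', 1, 13), ('B', 2, 18)]
```

The second result matches the output the question asks for (13 for A with one ticket, 18 for B with two tickets).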
Related
I have a model named Points which stores user points based on their actions.
class Points(CreateUpdateModelMixin):
    class Action(models.TextChoices):
        BASE = 'BASE', _('Base')
        REVIEW = 'REVIEW', _('Review')
        FOLLOW = 'FOLLOW', _('Follow')
        VERIFIED_REVIEW = 'VERIFIED_REVIEW', _('Verified Review')
        REFERRAL = 'REFERRAL', _('Referral')
        ADD = 'ADD', _('Add')
        SUBTRACT = 'SUBTRACT', _('Subtract')
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    points = models.IntegerField()
    action = models.CharField(max_length=64, choices=Action.choices, default=Action.BASE)
    class Meta:
        db_table = "diner_points"
Please note that there are multiple rows for the same user.
For the past few days I've been trying to write a query that gets the total_points of a user and also the rank of that user.
Using:
Django 3.2
MySQL 5.7
I would like to hear your input. Thanks.
I wrote this query and many others like it, but none of them give the results I want.
Let's suppose the data is something like this.
user points
771 221
1083 160
1083 12
1083 10
771 -15
1083 4
1083 -10
124 0
23 1771
The current query I have written is this...
innerquery = (
    DinerPoint.objects
    .values("user")
    .annotate(total=Sum("points"))
    .distinct()
)
query = (
    DinerPoint.objects
    .annotate(
        total=Subquery(
            innerquery.filter(user=OuterRef("user")).values("total")
        ),
        rank=Subquery(
            DinerPoint.objects
            .annotate(
                total=Subquery(
                    innerquery.filter(user=OuterRef("user")).values("total")
                ),
                rank=Func(F("user"), function="Count")
            )
            .filter(
                Q(total__gt=OuterRef("total")) |
                Q(total=OuterRef("total"), user__lt=OuterRef("user"))
            )
            .values("rank")[:1]
        )
    )
)
query.values('user', 'total', 'rank').distinct().order_by('rank')
But this gives results like this:
<QuerySet [
{'user': 23, 'total': 1771, 'rank': 1},
{'user': 1083, 'total': 176, 'rank': 2},
{'user': 771, 'total': 106, 'rank': 8}, <---- issue because of duplicate entries
{'user': 124, 'total': 0, 'rank': 9}
]>
I've tried RANK and DENSE_RANK and didn't get the results I wanted.
The only way I got the results I wanted was through a Common Table Expression (CTE). Unfortunately, I can't use that because production runs MySQL 5.7.
P.S. I'm using the count and the greater-than comparison because of my use case: we have to get a user's rank among their friends.
The working code using a CTE via django_cte (you can ignore this because of MySQL 5.7 ;) ):
def get_queryset(user=None, following=False):
    if not user:
        user = User.objects.get(username="king")
    innerquery = (
        DinerPoint.objects
        .values("user", "user__username", "user__first_name", "user__last_name", "user__profile_fixed_url",
                "user__is_influencer", "user__is_verified", "user__instagram_handle")
        .annotate(total=Sum("points"))
        .distinct()
    )
    if following:
        innerquery = innerquery.filter(Q(user__in=Subquery(user.friends.values('id'))) |
                                       Q(user=user))
    basequery = With(innerquery)
    subquery = (
        basequery.queryset()
        .filter(Q(total__gt=OuterRef("total")) |
                Q(total=OuterRef("total"), user__lt=OuterRef("user")))
        .annotate(rank=Func(F("user"), function="Count"))
        .values("rank")
        .with_cte(basequery)
    )
    query = (
        basequery.queryset()
        .annotate(rank=Subquery(subquery) + 1)
        .select_related("user")
        .with_cte(basequery)
    )
    return query
I solved this with a Func expression that counts distinct users. The final query that works for me is included below, in case you are looking for an answer.
rank=Func(
F("user"),
function="Count",
template="%(function)s(DISTINCT %(expressions)s)",
),
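To see why DISTINCT matters here, the counting logic can be sketched in plain Python. The rows are copied from the sample table in the question; the two helper functions are hypothetical stand-ins for COUNT(user) and COUNT(DISTINCT user) in the rank subquery, and both add 1 in the same way the final query does.

```python
from collections import defaultdict

# (user, points) rows copied from the sample table in the question.
rows = [
    (771, 221), (1083, 160), (1083, 12), (1083, 10), (771, -15),
    (1083, 4), (1083, -10), (124, 0), (23, 1771),
]

# Total points per user, as the inner query computes them.
totals = defaultdict(int)
for user, points in rows:
    totals[user] += points

def is_ahead(other, user):
    """True if `other` places before `user` (higher total, ties broken by lower id)."""
    return totals[other] > totals[user] or (
        totals[other] == totals[user] and other < user
    )

def rank_counting_rows(user):
    """Mimics COUNT(user): every row of a better-placed user is counted."""
    return 1 + sum(1 for other, _ in rows if is_ahead(other, user))

def rank_counting_distinct_users(user):
    """Mimics COUNT(DISTINCT user): each better-placed user is counted once."""
    return 1 + len({other for other, _ in rows if is_ahead(other, user)})

print(rank_counting_rows(124))            # 9, inflated by repeated rows
print(rank_counting_distinct_users(124))  # 4
```

User 124 has eight rows belonging to three better-placed users ahead of it, which is where the inflated rank of 9 in the question comes from.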
Final query
def get_queryset(self):
    following = self.request.query_params.get("following", False)
    innerquery = (
        DinerPoint.objects.values("user").annotate(total=Sum("points")).distinct()
    )
    basequery = DinerPoint.objects
    if following:
        innerquery = innerquery.filter(
            Q(user__in=Subquery(self.request.user.friends.values("id")))
            | Q(user=self.request.user)
        )
        basequery = basequery.filter(
            Q(user__in=Subquery(self.request.user.friends.values("id")))
            | Q(user=self.request.user)
        )
    query = (
        basequery.annotate(
            total=Subquery(
                innerquery.filter(user=OuterRef("user")).values("total")
            ),
            rank=Subquery(
                DinerPoint.objects.annotate(
                    total=Subquery(
                        innerquery.filter(user=OuterRef("user")).values("total")
                    ),
                    rank=Func(
                        F("user"),
                        function="Count",
                        template="%(function)s(DISTINCT %(expressions)s)",
                    ),
                )
                .filter(
                    Q(total__gt=OuterRef("total"))
                    | Q(total=OuterRef("total"), user__lt=OuterRef("user"))
                )
                .values("rank")
            )
            + 1,
        )
        .values(
            "user",
            "user__username",
            "user__first_name",
            "user__last_name",
            "user__profile_fixed_url",
            "user__is_influencer",
            "user__is_verified",
            "user__instagram_handle",
            "total",
            "rank",
        )
        .distinct()
    )
    return query
Hi stackoverflow community, my question is about django annotate.
Basically, what I am trying to do is find orders that share the same values in two fields that live in two different tables.
This is my models.py
class Order(models.Model):
    id_order = models.AutoField(primary_key=True)
class OrderDelivery(models.Model):
    order = models.ForeignKey(Order, on_delete=models.SET_NULL, null=True, blank=True)
    delivery_address = models.TextField()
class OrderPickup(models.Model):
    order = models.ForeignKey(Order, on_delete=models.SET_NULL, null=True, blank=True)
    pickup_date = models.DateField(blank=True, null=True)
This is my current code:
dup_job = Order.objects.filter(
    orderpickup__pickup_date__range=(start_date, end_date)
).values(
    'orderdelivery__delivery_address',
    'orderpickup__pickup_date',
).annotate(
    duplicated=Count('orderdelivery__delivery_address')
).filter(
    duplicated__gt=1
)
Based on what I have, I am getting results like this (delivery_address is shortened for privacy purposes):
{'orderdelivery__delivery_address': '118A', 'orderpickup__pickup_date': datetime.date(2022, 3, 9), 'duplicated': 2}
{'orderdelivery__delivery_address': '11', 'orderpickup__pickup_date': datetime.date(2022, 3, 2), 'duplicated': 6}
{'orderdelivery__delivery_address': '11 A ', 'orderpickup__pickup_date': datetime.date(2022, 3, 3), 'duplicated': 5}
{'orderdelivery__delivery_address': '21', 'orderpickup__pickup_date': datetime.date(2022, 3, 10), 'duplicated': 3}
{'orderdelivery__delivery_address': '642', 'orderpickup__pickup_date': datetime.date(2022, 3, 7), 'duplicated': 2}
{'orderdelivery__delivery_address': '642', 'orderpickup__pickup_date': datetime.date(2022, 3, 8), 'duplicated': 2}
{'orderdelivery__delivery_address': 'N/A,5', 'orderpickup__pickup_date': datetime.date(2022, 3, 8), 'duplicated': 19}
Is there a way to get the id_order of those duplicated orders?
I have tried including id_order in .values(), but then the output is not accurate, because the annotation groups by id_order instead of delivery_address.
Thank you in advance.
You can get the smallest (or largest) item with a Min [Django-doc] (or Max) aggregate:
from django.db.models import Count, Min
dup_job = Order.objects.filter(
    orderpickup__pickup_date__range=(start_date, end_date)
).values(
    'orderdelivery__delivery_address',
    'orderpickup__pickup_date',
).annotate(
    min_id_order=Min('id_order'),
    duplicated=Count('orderdelivery__delivery_address')
).filter(
    duplicated__gt=1
)
Or, on PostgreSQL, you can make use of ArrayAgg [Django-doc] to generate a list of all the ids in each group:
# PostgreSQL only
from django.contrib.postgres.aggregates import ArrayAgg
from django.db.models import Count
dup_job = Order.objects.filter(
    orderpickup__pickup_date__range=(start_date, end_date)
).values(
    'orderdelivery__delivery_address',
    'orderpickup__pickup_date',
).annotate(
    id_orders=ArrayAgg('id_order'),
    duplicated=Count('orderdelivery__delivery_address')
).filter(
    duplicated__gt=1
)
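On a backend without ArrayAgg, one option is to fetch the three columns and group them in Python. The sketch below assumes rows shaped like the result of a values_list('id_order', 'orderdelivery__delivery_address', 'orderpickup__pickup_date') call; the ids and addresses are made up for illustration, and the names ids_by_key and duplicates are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical (id_order, delivery_address, pickup_date) rows for illustration only.
rows = [
    (101, '118A', date(2022, 3, 9)),
    (102, '118A', date(2022, 3, 9)),
    (103, '642', date(2022, 3, 7)),
    (104, '642', date(2022, 3, 8)),
    (105, '642', date(2022, 3, 8)),
    (106, '21', date(2022, 3, 10)),
]

# Collect the order ids for every (address, pickup date) combination.
ids_by_key = defaultdict(list)
for id_order, address, pickup_date in rows:
    ids_by_key[(address, pickup_date)].append(id_order)

# Keep only the combinations that occur more than once.
duplicates = {key: ids for key, ids in ids_by_key.items() if len(ids) > 1}

for (address, pickup_date), ids in duplicates.items():
    print(address, pickup_date, ids)
```

This moves the grouping out of the database, so it is best suited to date ranges that return a manageable number of rows.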
def get_queryset(self):
    date = datetime.date.today()
    current_year = date.today().year
    queryset = Subscription.objects.filter(created__year=current_year)\
        .exclude(user__is_staff=True).values(
            month=TruncMonth('created')
        ).annotate(
            total_members=Count('created')
        ).order_by('month')
    return queryset
This is my group-by function. How can I get zero when there is no value for a particular month?
{
"count": 2,
"next": null,
"previous": null,
"results": [
{
"total_members": 9,
"month": "2021-02-01T00:00:00Z"
},
{
"total_members": 3,
"month": "2021-03-01T00:00:00Z"
}
]
}
This is the output I am getting now. As there is no value for January, there is no row for it; what I want is a zero for the month of January.
Probably you can use Coalesce for this, for example:
import datetime
from django.db.models import Count, Value as V
from django.db.models.functions import Coalesce, TruncMonth
def get_queryset(self):
    date = datetime.date.today()
    current_year = date.today().year
    queryset = Subscription.objects.filter(created__year=current_year)\
        .exclude(user__is_staff=True).values(
            month=TruncMonth('created')
        ).annotate(
            total_members=Coalesce(Count('created'), V(0))
        ).order_by('month')
    return queryset
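One caveat worth keeping in mind: Coalesce only replaces a NULL inside a row that already exists, and a GROUP BY produces no row at all for a month without subscriptions. A minimal sketch of filling the gaps after the query has run is shown below; fill_missing_months and its parameters are hypothetical names, and the input rows mirror the JSON results shown in the question.

```python
from datetime import datetime, timezone

# Rows shaped like the JSON results in the question, with `month` already parsed.
rows = [
    {'month': datetime(2021, 2, 1, tzinfo=timezone.utc), 'total_members': 9},
    {'month': datetime(2021, 3, 1, tzinfo=timezone.utc), 'total_members': 3},
]

def fill_missing_months(rows, year, last_month):
    """Return one entry per month from January to `last_month`, defaulting to 0."""
    counts = {row['month'].month: row['total_members'] for row in rows}
    return [
        {
            'month': datetime(year, m, 1, tzinfo=timezone.utc),
            'total_members': counts.get(m, 0),
        }
        for m in range(1, last_month + 1)
    ]

filled = fill_missing_months(rows, 2021, 3)
for entry in filled:
    print(entry['month'].date(), entry['total_members'])
```

Here January appears with a count of 0, followed by the existing February and March counts.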
I am trying to query the Order objects of the last 6 months and group them by month.
These are my models:
class Order(models.Model):
created_on = models.DateTimeField(_("Created On"), auto_now_add=True)
and this is my method to parse month:
from django.db import models
from django.db.models import Func
class Month(Func):
    """
    Method to extract month
    """
    function = 'EXTRACT'
    template = '%(function)s(MONTH from %(expressions)s)'
    output_field = models.IntegerField()
And this is my query:
current_date = date.today()
months_ago = 6
six_month_previous_date = current_date - timedelta(days=(months_ago * 365 / 12))
order = Order.objects.filter(
    created_on__gte=six_month_previous_date,
).annotate(
    month=Month('created_on')
).values(
    'month'
).annotate(
    count=Count('id')
).values(
    'month',
    'count'
).order_by(
    'month'
)
In my database, the order table has only one entry, so the query returns:
[{'month': 10, 'count': 1}]
But I don't want this. I want one entry for each of the last 6 months; if there are no sales in a month, it should return count: 0.
Like this:
[
{'month': 10, 'count': 1},
{'month': 9, 'count': 0},
{'month': 8, 'count': 0},
{'month': 7, 'count': 0},
{'month': 6, 'count': 0},
{'month': 5, 'count': 0}
]
A database works under the closed world assumption, so it will not insert rows with 0. You can however post-process the list.
from django.db.models import Count
from django.utils.timezone import now
order = Order.objects.filter(
    created_on__gte=six_month_previous_date,
).values(
    month=Month('created_on')
).annotate(
    count=Count('id')
).order_by('month')
# map month number -> count for the months that have orders
order = {r['month']: r['count'] for r in order}
month = now().month
result = [
    {'month': (m % 12)+1, 'count': order.get((m % 12) + 1, 0)}
    for m in range(month-1, month-8, -1)
]
Note that Django already has an ExtractMonth function [Django-doc].
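The modulo arithmetic in the list comprehension is what lets the window wrap around a year boundary. A small sketch of that step in isolation follows; last_n_months and fill_counts are hypothetical helper names, and the window size n is a parameter (6 matches the output the question asks for).

```python
def last_n_months(current_month, n):
    """Month numbers (1-12) for the current month and the n-1 before it, newest first."""
    # Shift to a 0-based month, step backwards, and wrap with % 12;
    # Python's % always returns a non-negative result for a positive modulus.
    return [(m % 12) + 1 for m in range(current_month - 1, current_month - 1 - n, -1)]

def fill_counts(counts_by_month, current_month, n=6):
    """Build one {'month', 'count'} entry per month in the window, defaulting to 0."""
    return [
        {'month': m, 'count': counts_by_month.get(m, 0)}
        for m in last_n_months(current_month, n)
    ]

print(last_n_months(10, 6))  # [10, 9, 8, 7, 6, 5]
print(last_n_months(2, 6))   # [2, 1, 12, 11, 10, 9]
print(fill_counts({10: 1}, 10))
```

Because the lookup key is only the month number, this approach is safe as long as the window stays at 12 months or fewer.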
I have models like this:
class Subscription(models.Model):
    visable_name = models.CharField(max_length=50, unique=True)
    recipe_name = models.CharField(max_length=50)
    website_url = models.URLField()
class User(models.Model):
    username = models.CharField(max_length=50)
class UserSubs(models.Model):
    subscription = models.ForeignKey(Subscription, to_field='visable_name')
    user = models.ForeignKey(User, to_field='username')
And I want to prepare simple ranking, so I came up with something like this:
rank = (
    UserSubs.objects.values('subscription')
    .annotate(total=Count('user'))
    .order_by('-total')
)
what gives:
>> rank
[
{'total': 3, 'subscription': u'onet'},
{'total': 2, 'subscription': u'niebezpiecznik'},
{'total': 1, 'subscription': u'gazeta'}
]
what I need, is similar list of full objects:
[
{'total': 3, 'subscription': <Subscription: onet>},
{'total': 2, 'subscription': <Subscription: niebezpiecznik>},
{'total': 1, 'subscription': <Subscription: gazeta>}
]
I am not sure whether select_related will be helpful here, and I can't figure out how to use it :(
It is better to build your query from Subscription, because that is the object you need:
Subscription.objects.annotate(total=models.Count('usersubs')).order_by('-total')
Maybe you can use a dict and a list comprehension, and handle this as plain Python objects:
d = {sub.visable_name: sub for sub in Subscription.objects.all()}
new_rank = [
    {
        'total': row['total'],
        'subscription': d[row['subscription']]
    }
    for row in rank
]
which will give:
>> new_rank
[
{'total': 3, 'subscription': <Subscription: onet>},
{'total': 2, 'subscription': <Subscription: niebezpiecznik>},
{'total': 1, 'subscription': <Subscription: gazeta.pl>}
]