I maintain a Django service that allows online community moderators to review/approve/reject user posts. Right now we measure the average "time to approval" but we need to start measuring the 90th percentile "time to approval" instead. So where we used to say "on average content gets approved in 3.3 hours", we might now say something like "90% of content is approved in 4.2 hours or less".
# Models.py
class Moderation(models.Model):
    content = models.TextField()
    created_at = models.DateTimeField(auto_now_add=True)
    message_id = models.TextField(blank=True, null=True)

class ModerationAction(models.Model):
    moderation = models.ForeignKey(Moderation)
    action = models.CharField(max_length=50)
    created_at = models.DateTimeField(auto_now_add=True)
# stats.py
average_time_to_approve_7days = ModerationAction.objects.filter(
    action__in=moderation_actions,
    created_at__gte=timezone.now() - timedelta(days=7)
).annotate(
    time_to_approve=F('created_at') - F('moderation__created_at')
).values(
    'action',
    'time_to_approve'
).aggregate(
    Avg('time_to_approve')
)['time_to_approve__avg']
# This returns a value like datetime.timedelta(0, 4008, 798824)
My goal: I'm seeking a way to get the 90th percentile time rather than the average time.
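One way to get there without database-specific SQL is to pull the annotated durations into Python and take a nearest-rank percentile. This is only a minimal sketch: the helper name percentile_nearest_rank is hypothetical, and it assumes the list of timedeltas is small enough to fit in memory (for example, fetched with values_list('time_to_approve', flat=True) on the annotated queryset above).

```python
import math
from datetime import timedelta


def percentile_nearest_rank(durations, pct):
    """Return the smallest value such that at least pct% of the values are <= it.

    Hypothetical helper; returns None for an empty input.
    """
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    if not durations:
        return None
    ordered = sorted(durations)
    # 1-based nearest rank: ceil(pct / 100 * n)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Ten approvals taking 1, 2, ..., 10 hours.
sample = [timedelta(hours=h) for h in range(1, 11)]
p90 = percentile_nearest_rank(sample, 90)
print(p90)  # 9:00:00
```

With that sample, the result reads as "90% of content is approved in 9 hours or less". If the table is large, a database-side percentile would avoid loading every row, but that depends on the backend in use.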
Related
I have three models:
class Ticket(models.Model):
    date = models.DateTimeField(default=datetime.datetime.now, blank=True)
    subject = models.CharField(max_length=256)
    description = models.TextField()

class Comments(models.Model):
    date = models.DateTimeField(default=datetime.datetime.now, blank=True)
    comment = models.TextField()
    action = models.ForeignKey(Label, on_delete=models.CASCADE, limit_choices_to={'group__name': 'action'}, related_name='action_label')
    ticket = models.ForeignKey(Ticket, on_delete=models.CASCADE)

class Label(models.Model):
    name = models.CharField(max_length=50)
    group = models.ForeignKey(LabelGroup, on_delete=models.CASCADE)
My goal is to get an average timespan between Ticket.date (opened) and the latest Comment.date of type Label.name = containment. Lastly, I want to pass in a date range (start_date, end_date) to limit to only tickets within that time period.
Ultimately, I want to return:
{
avg_time_to_contain: 37,
avg_time_to_recover: 157
}
from a DRF view.
The query set I have so far is:
queryset = Comments.objects.filter(ticket__date__gte=start_date, ticket__date__lte=end_date).filter(action__name__icontains="containment").distinct(ticket).aggregate(avg_containment=Avg(F(date)- F(ticket__date)))
I am getting an error saying ticket is not defined. I believe that is due to my distinct call, but I am having a hard time translating what I am after into a query.
Pseudo code for the query (in my thinking),
Grab comments that have tickets (FK) with dates between start / end date
Filter by the action (FK) name having the word containment
Just grab one distinct ticket at a time (feel like I am not doing this right, basically I want to just have the last entry for each ticket to be able to pull the diff between open & last containment comment, and then average the difference in time)
Get the average of the containment date comment - the open date.
ERROR MSG:
NameError at /api/timetocontain/
name 'ticket' is not defined
Request Method: GET Request URL: http://127.0.0.1/api/timetocontain/
Django Version: 4.1.3 Exception Type: NameError Exception Value:
name 'ticket' is not defined
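The NameError itself comes from the bare names in the query: distinct(ticket), F(date) and F(ticket__date) refer to Python variables that do not exist, whereas Django expects field names as strings, e.g. distinct('ticket') and F('date'). The pseudo code above (keep only the latest containment comment per ticket, then average the differences) can be sketched in plain Python. The function name and the (ticket_id, opened, commented) row shape are assumptions for illustration, not part of the original models.

```python
from datetime import datetime, timedelta


def average_time_to_contain(rows):
    """Average (latest containment comment - ticket opened) across tickets.

    rows: iterable of (ticket_id, ticket_opened, comment_date) tuples, one per
    containment comment. Hypothetical shape; returns None if there are no rows.
    """
    latest = {}
    for ticket_id, opened, commented in rows:
        previous = latest.get(ticket_id)
        # Keep only the most recent containment comment for each ticket.
        if previous is None or commented > previous[1]:
            latest[ticket_id] = (opened, commented)
    if not latest:
        return None
    total = sum((commented - opened for opened, commented in latest.values()), timedelta())
    return total / len(latest)


rows = [
    (1, datetime(2023, 1, 1, 0, 0), datetime(2023, 1, 1, 10, 0)),
    (1, datetime(2023, 1, 1, 0, 0), datetime(2023, 1, 1, 20, 0)),  # latest for ticket 1
    (2, datetime(2023, 1, 2, 0, 0), datetime(2023, 1, 2, 10, 0)),
]
print(average_time_to_contain(rows))  # 15:00:00
```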
class IncomeStream(models.Model):
    product = models.ForeignKey(Product, related_name="income_streams")
    from_date = models.DateTimeField(blank=True, null=True)
    to_date = models.DateTimeField(blank=True, null=True)
    value = MoneyField(max_digits=14, decimal_places=2, default_currency='USD')

class Product(models.Model):
    ...

class Sale(models.Model):
    product = models.ForeignKey(Product, related_name="sales")
    created_at = models.DateTimeField(auto_now_add=True)
    ...
With the above model, suppose I want to add a value to some Sales using .annotate.
This value is called cpa (cost per action): cpa is the value of the IncomeStream whose from_date and to_date include the Sale created_at in their range.
Furthermore, from_date and to_date are both nullable, in which case we assume they mean infinity.
For example:
<IncomeStream: from 2021-10-10 to NULL, value 10$, product TEST>
<IncomeStream: from NULL to 2021-10-09, value 5$, product TEST>
<Sale: product TEST, created_at 2019-01-01, [cpa should be 5$]>
<Sale: product TEST, created_at 2021-11-01, [cpa should be 10$]>
My question is: is it possible to write all these conditions using only the Django ORM and annotate? If yes, how?
I know F objects can traverse relationships like this:
Sale.objects.annotate(cpa=F('product__income_streams__value'))
But then where exactly can I write all the logic to determine which specific income_stream it should pick the value from?
Please suppose no income stream have overlapping dates for the same product, so the above mentioned specs never result in conflicts.
Something like this should get you started
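from django.db.models import OuterRef, Q, Subquery, Sum

subquery = (
    IncomeStream
    .objects
    .values('product')  # group by product primary key i.e. product_id
    .filter(product=OuterRef('product'))
    # the stream must start on or before the sale; a NULL bound means "no limit"
    .filter(Q(from_date__isnull=True) | Q(from_date__lte=OuterRef('created_at')))
    # ... and end on or after it
    .filter(Q(to_date__isnull=True) | Q(to_date__gte=OuterRef('created_at')))
    .annotate(total_value=Sum('value'))
)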
Then with the subquery
sales = (
    Sale
    .objects
    .annotate(
        cpa=Subquery(
            subquery.values('total_value')
        )  # the subquery should return only one row,
           # so we just need the total_value column
    )
)
Without the opportunity to play around with this in the shell myself, I'm not 100% sure. It should be close, though.
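To sanity-check whatever the ORM returns, the selection rule from the question (a NULL from_date or to_date counts as unbounded on that side) can be written out in plain Python. This is just a sketch: cpa_for_sale and the (from_date, to_date, value) tuple layout are made up for illustration, and values are plain numbers rather than Money objects.

```python
from datetime import datetime


def cpa_for_sale(created_at, streams):
    """Return the value of the first stream whose date range contains created_at.

    streams: list of (from_date, to_date, value); a None bound is open-ended.
    Returns None when no stream covers the sale.
    """
    for from_date, to_date, value in streams:
        starts_in_time = from_date is None or from_date <= created_at
        ends_in_time = to_date is None or created_at <= to_date
        if starts_in_time and ends_in_time:
            return value
    return None


# The two income streams from the example in the question.
streams = [
    (datetime(2021, 10, 10), None, 10),
    (None, datetime(2021, 10, 9), 5),
]
print(cpa_for_sale(datetime(2019, 1, 1), streams))   # 5
print(cpa_for_sale(datetime(2021, 11, 1), streams))  # 10
```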
In my Django project I am trying to repost a record based on its frequency of recurrence, which could be daily, weekly, monthly, etc. Let's say a record is posted today and its frequency of recurrence is weekly. I want the record to keep reappearing every week, on the same day and at the same time as when it was created. In other words, an old record should become new again and move to the top of the other records, which will now be older than it, based on its frequency of recurrence.
models.py
class Menu(models.Model):
    none = 0
    Daily = 1
    Weekly = 7
    Monthly = 30
    Quarterly = 90
    SemiAnual = 180
    Yearly = 365
    Frequency_Of_Reocurrence = (
        (none, "None"),
        (Daily, "Daily"),
        (Weekly, "Weekly"),
        (Monthly, "Monthly"),
        (Quarterly, "After 3 Months"),
        (SemiAnual, "After 6 Months"),
        (Yearly, "After 12 Months")
    )
    vendor = models.ForeignKey(Vendor, on_delete=models.CASCADE)
    name = models.CharField(verbose_name="food name", null=False, blank=False, max_length=100)
    description = models.TextField(verbose_name="Food Description", max_length=350, null=False, blank=False)
    image = models.ImageField(upload_to=user_directory_path, default='veggies.jpg')
    isrecurring = models.BooleanField(default=False)
    # choices must reference the tuple defined above (same spelling)
    frequencyofreocurrence = models.IntegerField(choices=Frequency_Of_Reocurrence)
    datetimecreated = models.DateTimeField(verbose_name='date-time-created', auto_now_add=True)
How exactly can I achieve this in my views?
Thanks in advance.
You want to set up a cron job to check on a daily basis, have conditions to determine if a particular record is due for repost then repost that (update the time stamp if you’re not looking to create a new record). I don’t write Django so I can’t give specifics.
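The "is this record due for a repost today" condition from that daily check could look roughly like this in plain Python. It is a sketch only: is_due is a made-up helper, and it takes the model's day counts at face value (so "Monthly" means every 30 days, not the same calendar day each month).

```python
from datetime import date


def is_due(created_on, frequency_days, today):
    """Return True if a whole number of recurrence periods has elapsed.

    frequency_days of 0 (the "None" choice) never recurs. The creation day
    itself is not counted as a repost.
    """
    if frequency_days <= 0:
        return False
    elapsed = (today - created_on).days
    return elapsed > 0 and elapsed % frequency_days == 0


created = date(2024, 1, 1)
print(is_due(created, 7, date(2024, 1, 8)))  # True: exactly one week later
print(is_due(created, 7, date(2024, 1, 9)))  # False: eight days later
```

A daily job would run this check for every recurring record and bump the timestamp (or create a new record) for the ones that come back True.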
It's a little bit complicated to solve, but something like this might work:
from django.db.models import IntegerField, F, ExpressionWrapper
from django.db.models.functions import TruncDate, Mod, Now

# Filter on the model field (frequencyofreocurrence), not the choices tuple.
qset = (
    Menu.objects.exclude(frequencyofreocurrence=0)
    .annotate(duration=ExpressionWrapper(
        (TruncDate(Now()) - TruncDate('datetimecreated')) / F('frequencyofreocurrence'),
        output_field=IntegerField()))
    .annotate(reorder=ExpressionWrapper(Mod('duration', 10**8), output_field=IntegerField()))
    .filter(reorder=0).order_by('datetimecreated').values()
)
Maybe you can refine the code.
I am using Django 1.6.
My Model looks something like:
class Transaction(models.Model):
    type = models.CharField(max_length=255, db_index=True)
    amount = models.DecimalField(decimal_places=2, max_digits=10, default=0.00)
I have a few transactions, some of which are credits and others debits (determined by the type column). I need to compute the balance across all transactions, i.e. (debit - credit).
Currently, I could do that using 2 queries as below:
debit_amount = Transaction.objects.filter(type='D').aggregate(debit_amount=Sum('amount'))['debit_amount']
credit_amount = Transaction.objects.filter(type='C').aggregate(credit_amount=Sum('amount'))['credit_amount']
balance = debit_amount - credit_amount
I am looking for something like:
Transaction.objects.aggregate(credit=Sum('amount', filter=Q(type='C')), debit=Sum('amount', filter=Q(type='D')))
You can use a conditional expression:
from django.db.models import Case, DecimalField, F, Q, Sum, When

result = Transaction.objects.aggregate(
    # Sum the amount only for rows of the matching type; other rows add 0.
    credit=Sum(Case(
        When(Q(type='C'), then=F('amount')),
        output_field=DecimalField(),  # amount is a DecimalField
        default=0
    )),
    debit=Sum(Case(
        When(Q(type='D'), then=F('amount')),
        output_field=DecimalField(),
        default=0
    )),
)
balance = result['debit'] - result['credit']
This should be possible as of Django 2.0 (https://docs.djangoproject.com/en/2.0/ref/models/conditional-expressions/#case):
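from django.db.models import Q, Sum

totals = Transaction.objects.aggregate(
    credit=Sum('amount', filter=Q(type='C')),
    debit=Sum('amount', filter=Q(type='D'))
)
# aggregate() returns a dict, so use key access; balance is debit - credit
total = totals['debit'] - totals['credit']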
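For reference, both forms boil down to a single query with conditional sums. Here is a rough sketch of that idea using the standard-library sqlite3 module rather than the ORM. The table name txn, the balance helper and storing amounts as integer cents are assumptions made for illustration, and the SQL is not what Django emits verbatim. The COALESCE covers the case where the table has no rows at all, where SUM would otherwise return NULL (None in Python) and the subtraction would fail.

```python
import sqlite3


def balance(rows):
    """Return (debit, credit, debit - credit) for (type, amount_in_cents) rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE txn (type TEXT, amount INTEGER)")
        conn.executemany("INSERT INTO txn (type, amount) VALUES (?, ?)", rows)
        # One pass over the table: each SUM only counts rows of its own type.
        debit, credit = conn.execute(
            """
            SELECT
                COALESCE(SUM(CASE WHEN type = 'D' THEN amount ELSE 0 END), 0),
                COALESCE(SUM(CASE WHEN type = 'C' THEN amount ELSE 0 END), 0)
            FROM txn
            """
        ).fetchone()
    finally:
        conn.close()
    return debit, credit, debit - credit


print(balance([("D", 1500), ("C", 400), ("D", 250), ("C", 100)]))  # (1750, 500, 1250)
```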
I have a payment model that I want to be able to filter by date:
class LeasePayment(CommonInfo):
    version = IntegerVersionField()
    amount = models.DecimalField(max_digits=7, decimal_places=2)
    lease = models.ForeignKey(Lease)
    leaseterm = models.ForeignKey(LeaseTerm)
    payment_date = models.DateTimeField()
    method = models.CharField(max_length=2, default='Ch',
                              choices=PAYMENT_METHOD_CHOICES)
Basically I want to be able to input 2 dates and display all the data between them. Right now I have started to implement this solution https://groups.google.com/forum/#!topic/django-filter/lbi_B4zYq4M based on django_filter. However, since the task is pretty trivial, I was wondering if there is an easier way.
Try the __range lookup on payment_date; it returns the rows whose payment_date falls within the selected interval:
LeasePayment.objects.filter(payment_date__range=[start_date, end_date])
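One caveat: payment_date is a DateTimeField, so if end_date is a plain date it is treated as midnight at the start of that day, and payments made later on the end day are left out. A small sketch of one way around that, building a half-open datetime interval from two calendar days. The helper name inclusive_day_bounds is hypothetical, and it uses naive datetimes, so a project with time zone support enabled would need aware values instead.

```python
from datetime import date, datetime, time, timedelta


def inclusive_day_bounds(start_day, end_day):
    """Return (start, end_exclusive) datetimes covering both calendar days in full."""
    if end_day < start_day:
        raise ValueError("end_day must not be before start_day")
    start = datetime.combine(start_day, time.min)
    # Midnight after the last day, to be used with a strict "less than".
    end_exclusive = datetime.combine(end_day + timedelta(days=1), time.min)
    return start, end_exclusive


start, end_exclusive = inclusive_day_bounds(date(2024, 3, 1), date(2024, 3, 31))
print(start, end_exclusive)  # 2024-03-01 00:00:00 2024-04-01 00:00:00

# Intended use with the model above (not executed here):
# LeasePayment.objects.filter(payment_date__gte=start, payment_date__lt=end_exclusive)
```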