I have a puzzle on my hands. As an exercise, I am trying to write a queryset that helps me visualize which of my professional contacts I should prioritize corresponding with.
To this end I have a couple of models:
class Person(models.Model):
    name = models.CharField(max_length=256)
    email = models.EmailField(blank=True, null=True)
    target_contact_interval = models.IntegerField(default=45)

class ContactInstance(models.Model):
    person = models.ForeignKey(Person, on_delete=models.CASCADE, related_name='contacts')
    date = models.DateField()
    notes = models.TextField(blank=True, null=True)
The column target_contact_interval on the Person model specifies the maximum number of days that should pass before I reach out to this person again.
A ContactInstance reflects a single point of contact with a Person. A Person can have a reverse relationship with many ContactInstance objects.
So, the first Person in the queryset should be the one that is most overdue: the one with the greatest difference between the time elapsed since its most recent related ContactInstance and its own target_contact_interval.
So my dream function would look something like:
Person.objects.order_by(contact__latest__date__day - timedelta(days=F(target_contact_interval))
but of course that won't work for a variety of reasons.
I'm sure someone could write up some raw PostgreSQL for this, but I am really curious to know if there is a way to accomplish it using only the Django ORM.
Here are the pieces I've found so far, but I'm having trouble putting them together.
I might be able to use a Subquery to annotate the date of the most recent datapoint:
from django.db.models import OuterRef, Subquery
latest = ContactInstance.objects.filter(person=OuterRef('pk')).order_by('-date')
Person.objects.annotate(latest_contact_date=Subquery(latest.values('date')[:1]))
And I like the idea of sorting the null values at the end:
from django.db.models import F
Person.objects.order_by(F('last_contacted').desc(nulls_last=True))
But I don't know where to go from here. I've been trying to put everything into order_by(), but I can't discern if it is possible to use F() with annotated values or with timedelta in my case.
UPDATE:
I have changed the target_contact_interval field to a DurationField as suggested. Here is the query I am attempting to use:
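from datetime import datetime
from django.db.models import F, OuterRef, Subquery

ci = ContactInstance.objects.filter(
    person=OuterRef('pk')
).order_by('-date')
Person.objects.annotate(
    # [:1] slices the subquery so only the newest date is returned
    latest_contact_date=Subquery(ci.values('date')[:1])
).order_by((
    (datetime.today().date() - F('latest_contact_date')) -
    F('target_contact_interval')
).desc(nulls_last=True))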
It seems to me that this should work, however, the queryset is still not ordering correctly.
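For checking the queryset's output, it can help to state the intended ordering in plain Python. The following is only a minimal sketch of that ordering, not an ORM solution; the function name overdue_order and the sample data are hypothetical, and target_contact_interval is assumed to be a timedelta, as it would be with a DurationField.

```python
from datetime import date, timedelta

def overdue_order(people, today):
    """Sort (name, latest_contact_date, target_interval) tuples, most overdue first.

    People with no recorded contact (latest_contact_date is None) go last,
    mirroring desc(nulls_last=True).
    """
    def key(person):
        _, latest, interval = person
        if latest is None:
            return (1, timedelta(0))           # no contacts: sort after everyone else
        overdue = (today - latest) - interval  # positive means past the target
        return (0, -overdue)                   # negate for descending order
    return sorted(people, key=key)

# Hypothetical sample data, not from the original post.
people = [
    ("Ann", date(2024, 1, 1), timedelta(days=45)),
    ("Bob", date(2024, 2, 20), timedelta(days=10)),
    ("Cy", None, timedelta(days=30)),
    ("Dee", date(2023, 12, 1), timedelta(days=90)),
]
ranked = [name for name, _, _ in overdue_order(people, today=date(2024, 3, 1))]
```

Comparing the names the queryset returns against a list like ranked, built from the same rows, is one way to tell whether the ordering or the annotation is what goes wrong.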
Related
If I have a model of an Agent that looks like this:
class Agent(models.Model):
    name = models.CharField(max_length=100)
and a related model that looks like this:
class Deal(models.Model):
    agent = models.ForeignKey(Agent, on_delete=models.CASCADE)
    price = models.IntegerField()
and a view that looked like this:
from django.views.generic import ListView

class AgentListView(ListView):
    model = Agent
I know that I can adjust the sort order of the agents in the queryset and I even know how to sort the agents by the number of deals they have like so:
queryset = Agent.objects.all().annotate(uc_count=Count('deal')).order_by('-uc_count')
However, I cannot figure out how to sort the agents by the sum of the prices of each agent's deals.
Given you already know how to annotate and sort by those annotations, you're 90% of the way there. You just need to use the Sum aggregate and follow the relationship backwards.
The Django docs give this example:
Author.objects.annotate(total_pages=Sum('book__pages'))
You should be able to do something similar:
queryset = Agent.objects.all().annotate(deal_total=Sum('deal__price')).order_by('-deal_total')
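My spidey sense is telling me you may need to add distinct=True to the Sum aggregation, but I'm not sure without testing. Bear in mind that distinct=True would also collapse deals that happen to share the same price, so only add it if the totals actually come out wrong.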
Building off of Greg Kaleka's answer and the question you asked under it, this is likely the solution you are looking for:
from django.db.models import Case, IntegerField, Sum, When

queryset = Agent.objects.all().annotate(
    deal_total=Sum('deal__price'),
    o=Case(
        When(deal_total__isnull=True, then=0),
        default=1,
        output_field=IntegerField()
    )
).order_by('-o', '-deal_total')
Explanation:
What's happening is that deal_total adds up the prices of each agent's deals, but if an Agent has no deals to begin with, the sum of the prices is None. The Case/When annotation o is 0 for those agents and 1 for everyone else, so ordering by '-o' first puts the agents that have deals ahead of the ones that have none, and '-deal_total' then orders the agents with deals by their total.
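The two-key ordering can be illustrated in plain Python. This is a minimal sketch of the idea only, with hypothetical names (order_agents, deals) and made-up data; it does not touch the database.

```python
def order_agents(deals_by_agent):
    """Order agents by total deal price, descending, with deal-less agents last."""
    rows = []
    for name, prices in deals_by_agent.items():
        total = sum(prices) if prices else None  # SQL SUM over no rows yields NULL
        o = 0 if total is None else 1            # same role as the Case/When annotation
        rows.append((name, o, total))
    # Counterpart of order_by('-o', '-deal_total'): flag first, then the total.
    rows.sort(key=lambda r: (-r[1], -(r[2] or 0)))
    return [(name, total) for name, _, total in rows]

# Hypothetical sample data.
deals = {"Avery": [100, 250], "Blake": [], "Casey": [900]}
ordered = order_agents(deals)
```

If the only goal is to push the None totals to the end, ordering by F('deal_total').desc(nulls_last=True) may be a shorter alternative to the extra annotation, assuming a Django version that supports nulls_last.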
I'm new to Django and I'm facing a question to which I couldn't find an answer on Stack Overflow.
Basically, I have 2 models, Client and Order defined as below:
class Client(models.Model):
    name = models.CharField(max_length=200)
    registration_date = models.DateTimeField(default=timezone.now)
    # ..

class Order(models.Model):
    client = models.ForeignKey(Client, on_delete=models.CASCADE, related_name='orders')
    is_delivered = models.BooleanField(default=False)
    order_date = models.DateTimeField(default=timezone.now)
    # ..
I would like my QuerySet clients_results to fulfill the 2 following conditions:
1. Client objects meet some conditions (for example, their name starts with "d" and they registered in 2019, but it could be more complex).
2. The Order objects I can access through the orders relationship defined in 'related_name' are only the ones that meet other conditions; for example, the order is not delivered and was placed in the last 6 weeks.
I could do this directly in the template but I feel this is not the correct way to do it.
Additionally, I read in the doc that Base Manager from Order shouldn't be used for this purpose.
Finally, I found a question relatively close to mine using Q and F, but in the end, I would get the order_id while, ideally, I would like to have the whole object.
Could you please advise me on the best way to address this need?
Thanks a lot for your help!
You probably should use a Prefetch(..) object [Django-doc] here to fetch the related non-delivered Orders for each Client and store them on the Client under a different attribute, since reusing the orders name for a filtered list could cause confusion.
You thus can create a queryset like:
from datetime import timedelta
from django.db.models import Prefetch
from django.utils.timezone import now

last_six_weeks = now() - timedelta(days=42)
clients_results = Client.objects.filter(
    name__startswith='d'
).prefetch_related(
    Prefetch(
        'orders',
        Order.objects.filter(is_delivered=False, order_date__gte=last_six_weeks),
        to_attr='nondelivered_orders'
    )
)
This will contain all Clients where the name starts with 'd', and each Client object that arises from this queryset will have an attribute nondelivered_orders that contains a list of Orders that are not delivered, and ordered in the last 42 days.
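To make the shape of the result concrete, here is a plain-Python sketch of what the queryset above is meant to produce. The dataclasses and the attach_nondelivered helper are hypothetical stand-ins for the Django models, used only to illustrate the semantics.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

@dataclass
class Order:
    is_delivered: bool
    order_date: datetime

@dataclass
class Client:
    name: str
    orders: list = field(default_factory=list)

def attach_nondelivered(clients, now):
    """Keep clients whose name starts with 'd' and attach their recent undelivered orders."""
    cutoff = now - timedelta(days=42)
    selected = [c for c in clients if c.name.startswith("d")]
    for c in selected:
        # Stored under a separate attribute so c.orders still means "all orders".
        c.nondelivered_orders = [
            o for o in c.orders
            if not o.is_delivered and o.order_date >= cutoff
        ]
    return selected

# Hypothetical sample data.
now = datetime(2024, 3, 1, tzinfo=timezone.utc)
dana = Client("dana", [
    Order(False, now - timedelta(days=10)),  # undelivered and recent: kept
    Order(True, now - timedelta(days=5)),    # delivered: dropped
    Order(False, now - timedelta(days=60)),  # too old: dropped
])
eli = Client("eli", [Order(False, now - timedelta(days=1))])
selected = attach_nondelivered([dana, eli], now)
```

The point of the separate attribute is visible here: dana.orders keeps all three orders while dana.nondelivered_orders holds only the one that passes both conditions.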
Let's say we have the following simplistic models:
class Category(models.Model):
    name = models.CharField(max_length=264)

    def __str__(self):
        return self.name

    class Meta:
        verbose_name_plural = "categories"

class Status(models.Model):
    name = models.CharField(max_length=264)

    def __str__(self):
        return self.name

    class Meta:
        verbose_name_plural = "status"

class Product(models.Model):
    title = models.CharField(max_length=264)
    description = models.CharField(max_length=264)
    category = models.ForeignKey(Category, on_delete=models.CASCADE)
    # DecimalField also requires decimal_places; 2 is an assumed value
    price = models.DecimalField(max_digits=10, decimal_places=2)
    status = models.ForeignKey(Status, on_delete=models.CASCADE)
My aim is to get some statistics, like total products, total sales, average sales etc, based on which price bin each product belongs to.
So, the price bins could be something like 0-100, 100-500, 500-1000, etc.
I know how to use pandas to do something like that:
Binning column with python pandas
I am searching for a way to do this with the Django ORM.
One of my thoughts is to convert the queryset into a list and apply a function to get the appropriate price bin and then do the statistics.
Another thought, which I am not sure how to implement, is the same as the one above, but applying the bin function only to the field in the queryset I am interested in.
There are three pathways I can see.
First is composing the SQL you want to use directly and sending it to your database through the model's manager: .objects.raw("[sql goes here]"). This answer shows how to define groups with a simple function on the content; something like that could work:
SELECT FLOOR(grade/5.00)*5 As Grade,
COUNT(*) AS [Grade Count]
FROM TableName
GROUP BY FLOOR(Grade/5.00)*5
ORDER BY 1
Second is that there is no reason you can't move the queryset (with .values() or .values_list()) into a pandas dataframe or similar and then bin it, as you mentioned. There is probably some efficiency loss in getting the queryset into a dataframe and then processing it, but I am not sure that it would always be a problem. If it's easier to compose and maintain, that might be fine.
The third way I would try (which I think is what you really want) is chaining .annotate() to label points with the bin they belong in, and the aggregate count function to count how many are in each bin. This is more advanced ORM work than I've done, but I think you'd start looking at something like the docs section on conditional aggregation. I've adapted this slightly to create the 'price_class' column first, with annotate.
from django.db.models import Count, F, Q
from django.db.models.functions import Floor

Product.objects.annotate(price_class=Floor(F('price') / 100)).aggregate(
    class_zero=Count('pk', filter=Q(price_class=0)),
    class_one=Count('pk', filter=Q(price_class=1)),
    class_two=Count('pk', filter=Q(price_class=2)),  # etc etc
)
I haven't tested this, so I'm not sure the floor step is going to work as written, and you may need an ExpressionWrapper to push price_class into the right type of output_field. All the best.
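Dividing by 100 only gives equal-width bins, while the question asks for uneven ones (0-100, 100-500, 500-1000). As a rough sketch of the second pathway without pandas, the binning and per-bin statistics could be done in plain Python with bisect. The names EDGES and bin_stats are hypothetical, bins are assumed to be half-open (lower edge included, upper edge excluded), and the prices are made up; in practice they might come from something like Product.objects.values_list('price', flat=True).

```python
from bisect import bisect_right
from decimal import Decimal

# Bin edges from the question: [0, 100), [100, 500), [500, 1000)
EDGES = [Decimal(0), Decimal(100), Decimal(500), Decimal(1000)]

def bin_stats(prices, edges=EDGES):
    """Return count, total and average per price bin; prices outside all bins are skipped."""
    stats = {}
    for price in prices:
        i = bisect_right(edges, price) - 1  # index of the bin's lower edge
        if i < 0 or i >= len(edges) - 1:
            continue                        # below the first or at/above the last edge
        label = f"{edges[i]}-{edges[i + 1]}"
        count, total = stats.get(label, (0, Decimal(0)))
        stats[label] = (count + 1, total + price)
    return {
        label: {"count": c, "total": t, "average": t / c}
        for label, (c, t) in stats.items()
    }

# Hypothetical sample prices.
prices = [Decimal("20"), Decimal("99.50"), Decimal("100"), Decimal("450"), Decimal("999")]
summary = bin_stats(prices)
```

This pulls every price into Python, so it trades database-side aggregation for simplicity; for large tables the annotate/aggregate route is likely the better fit.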
Let's say I have two models:
class Testmodel1(models.Model):
    amount = models.IntegerField(null=True)
    contact = models.CharField()
    entry_time = models.DateTimeField()

class Testmodel2(models.Model):
    name = models.CharField()
    mobile_no = models.ForeignKey(Testmodel1)
and I am creating objects of the second model (Testmodel2). Now I want to find the count of Testmodel2 objects created in the last 24 hours, by the mobile_no field.
What would be the best way of writing this query?
Any help would be appreciated.
It'd be better if you made the contact field a models.DateTimeField rather than a models.CharField. With a DateTimeField, you could easily use lte, gte, and other lookups to compare it to other datetimes.
For example, if Testmodel1.contact were a DateTimeField, the answer to your question would be (with past being a datetime 24 hours ago):
Testmodel1.objects.filter(contact__gte=past).count()
If the contact field contains a string representing a datetime, I'd recommend switching it over, since there's really no reason to store it as a string.
If you're unable to change these fields, unfortunately I don't think there's a way to do this on the database level. You'll have to filter them individually on the python side:
import arrow
from dateutil.parser import parse

results = []
past = arrow.utcnow().shift(hours=-24)  # 24 hours ago, in UTC
model_query = Testmodel1.objects.all()
for obj in model_query.iterator():
    contact_date = parse(obj.contact)  # Parse string into datetime
    if contact_date > past:
        results.append(obj)
print(len(results))
This will give you a list (note: NOT a queryset) containing all matching model instances. It'll be a lot slower than the other option would be, you can't edit the results afterwards with something like results.filter(amount__gte=1).count(), and it's not quite as clean.
That said, it'll get the job done.
EDIT
It occurs to me that this might be able to be done with annotation, but I'm not sure how that would be accomplished, or if it would even work. I defer to other answers if they can think of a way to use annotation to accomplish this in a better way, but stick to my original assessment that this should probably be a DateTime field.
EDIT 2
With a DateTime field now added on the other model, you can look it up across models like so:
past = arrow.utcnow().shift(hours=-24)
Testmodel2.objects.filter(mobile_no__entry_time__gte=past)
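If arrow is not available, the same 24-hour cutoff can be computed with the standard library. This is a small sketch with a hypothetical helper, count_recent, operating on plain datetimes rather than model instances; all datetimes are assumed to be timezone-aware.

```python
from datetime import datetime, timedelta, timezone

def count_recent(entry_times, now=None, hours=24):
    """Count the datetimes that fall within the last `hours` hours."""
    now = now or datetime.now(timezone.utc)
    past = now - timedelta(hours=hours)  # same role as `past` in the queries above
    return sum(1 for t in entry_times if t >= past)

# Hypothetical sample data.
now = datetime(2024, 3, 1, 12, 0, tzinfo=timezone.utc)
entry_times = [
    now - timedelta(hours=1),
    now - timedelta(hours=23, minutes=59),
    now - timedelta(hours=25),  # just outside the window
]
recent = count_recent(entry_times, now=now)
```

In a Django project, a value computed as django.utils.timezone.now() - timedelta(hours=24) should be usable as past in the mobile_no__entry_time__gte filter, and appending .count() to that queryset would return the number rather than the objects.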
I have a simple hierarchical model with a Person and RunningScore as its child.
This model stores data about the running scores of many users; simplified, it looks something like:
class Person(models.Model):
    firstName = models.CharField(max_length=200)
    lastName = models.CharField(max_length=200)

class RunningScore(models.Model):
    person = models.ForeignKey('Person', on_delete=models.CASCADE, related_name="scores")
    time = models.DecimalField(max_digits=6, decimal_places=2)
If I get a single Person, it comes with all the RunningScores associated with it, and this is standard behavior. My question is really simple: if I'd like to get a Person with only one RunningScore child (say the best result, i.e. min(time)), how can I do that?
I read the official Django documentation but have not found a solution.
I am not 100% sure if I get what you mean, but maybe this will help:
from django.db.models import Min
# 'scores' is the related_name of RunningScore.person, so the lookup is scores__time
Person.objects.annotate(min_running_time=Min('scores__time'))
The queryset will fetch Person objects with an additional min_running_time attribute.
You can also add a filter:
Person.objects.annotate(min_running_time=Min('scores__time')).filter(firstName__startswith='foo')
Accessing the first object's min_running_time attribute:
first_person = Person.objects.annotate(min_running_time=Min('scores__time'))[0]
print(first_person.min_running_time)
EDIT:
You can define a method or a property such as the following one to get the related object:
class Person(models.Model):
    ...
    @property
    def best_runner(self):
        try:
            # 'scores' is the related_name on RunningScore.person; lowest time first
            return self.scores.order_by('time')[0]
        except IndexError:
            return None
If you want one RunningScore for only one Person, you could use ordering and limit your queryset to 1 object.
Something like this, where person is a Person instance and ordering by 'time' ascending puts the best (lowest) time first:
person.scores.order_by('time')[0]
Here is the doc on limiting querysets:
https://docs.djangoproject.com/en/1.3/topics/db/queries/#limiting-querysets
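As a cross-check for the ORM approaches above, the "best score per person" rule can be written in a few lines of plain Python. This is only an illustrative sketch; best_scores and the sample pairs are hypothetical and stand in for rows of the RunningScore table.

```python
from decimal import Decimal

def best_scores(scores):
    """Map each person to their lowest (best) time from (person, time) pairs."""
    best = {}
    for person, time in scores:
        if person not in best or time < best[person]:
            best[person] = time
    return best

# Hypothetical sample data.
scores = [
    ("alice", Decimal("12.40")),
    ("bob", Decimal("11.95")),
    ("alice", Decimal("11.80")),
    ("bob", Decimal("12.10")),
]
best = best_scores(scores)
```

A person with no scores simply does not appear in the result, which corresponds to the None that the Min annotation or the property would give for them.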