I have the following model in django:
class task(models.Model):
    admin = models.BooleanField()
    date = models.DateField()
I am trying to write a filter that gives me a queryset in which rows with admin = True take priority.
So assume I have these 3 rows :
admin = True , date = 01-01-2019
admin = False , date = 01-01-2019
admin = False , date = 02-02-2019
The output of the query set will be :
admin = True , date = 01-01-2019
admin = False , date = 02-02-2019
It should filter out the admin=False row dated 01-01-2019, because there is already an admin=True row for that date, which takes priority.
I could do it by fetching everything and removing the rows from the queryset myself, but I want to make sure there is no better way first.
Rather than looping through the QuerySet and removing them yourself, one thing you could do is:
1. Fetch all the dates where admin is True
2. Fetch all the objects where either:
   i. admin is True
   ii. The date is not in part 1 (i.e. no admin=True row exists for that date)
This can be achieved with the following:
from django.db.models import Q
true_dates = task.objects.filter(admin=True).values_list("date", flat=True)
results = task.objects.filter(Q(admin=True)|~Q(date__in=true_dates))
This will most likely be more efficient than looping through your results yourself.
Note that since querysets are 'lazy' (they are only evaluated when they absolutely have to be), this results in just one database hit: true_dates is never evaluated on its own and is embedded in the second query as a subquery.
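To illustrate what a single query with a nested SELECT looks like, here is a rough sketch using the standard-library sqlite3 module. The table layout and the SQL text are assumptions that approximate what the ORM generates for `Q(admin=True) | ~Q(date__in=true_dates)`; they are not Django's exact output.

```python
import sqlite3

# Hypothetical in-memory table standing in for the `task` model.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task (admin INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO task (admin, date) VALUES (?, ?)",
    [(1, "2019-01-01"), (0, "2019-01-01"), (0, "2019-02-02")],
)

# One statement: keep admin rows, plus rows whose date has no admin row.
rows = conn.execute(
    """
    SELECT admin, date FROM task
    WHERE admin = 1
       OR date NOT IN (SELECT date FROM task WHERE admin = 1)
    ORDER BY date
    """
).fetchall()
print(rows)
```

With the three rows from the question, only the admin row for 2019-01-01 and the non-admin row for 2019-02-02 remain.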
Tim's answer is close, but incomplete, because he doesn't use Subquery().
This answer provides the same results, without having an additional query hit the database:
from django.db.models import Subquery, Q
dates = task.objects.filter(admin=True)
tasks = task.objects.filter(Q(admin=True) | ~Q(date__in=Subquery(dates.values('date'))))
I am trying to find a better way to loop over orders for the next seven days including today, what I have already:
unfilled_orders_0 = Model.objects.filter(delivery_on__date=timezone.now() + timezone.timedelta(0))
context['todays_orders'] = unfilled_orders_0.aggregate(field_1_sum=Sum('field_1'), field_2_sum=Sum('field_2'), field_3_sum=Sum('field_3'), field_4_sum=Sum('field_4'), field_5_sum=Sum('field_5'))
I'm wondering if I can somehow avoid having to do this seven times--one for each day. I assume there is a more efficient way to do this.
You can do this with a single ORM / db query, by providing Sum with an extra filter argument:
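from datetime import timedelta
from django.db.models import Q, Sum
from django.utils import timezone

now = timezone.now()
days_ahead = 7
fields = ["field_1", "field_2", ...]
# One conditional Sum per (field, day) pair, all computed in a single query
aggregate_kwargs = {
    f"s_{field}_{day}": Sum(field, filter=Q(delivery_on__date=now + timedelta(days=day)))
    for field in fields
    for day in range(days_ahead)
}
unfilled_orders = Model.objects.filter(delivery_on__date__lt=now + timedelta(days=days_ahead))
context.update(unfilled_orders.aggregate(**aggregate_kwargs))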
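The idea behind a filtered Sum is conditional aggregation: every per-day total becomes its own column of one SELECT. The sketch below shows that shape with the standard-library sqlite3 module. The `orders` table, the fixed date, and the CASE-based SQL are assumptions for illustration only; the SQL Django actually emits depends on the database backend.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (delivery_on TEXT, field_1 INTEGER)")
today = date(2024, 1, 1)  # fixed date so the example is reproducible
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        (today.isoformat(), 2),
        (today.isoformat(), 3),
        ((today + timedelta(days=1)).isoformat(), 5),
    ],
)

days_ahead = 3
# One SUM(CASE ...) column per day; rows for other days contribute NULL.
columns = ", ".join(
    f"SUM(CASE WHEN delivery_on = ? THEN field_1 END) AS s_field_1_{day}"
    for day in range(days_ahead)
)
params = [(today + timedelta(days=day)).isoformat() for day in range(days_ahead)]
row = conn.execute(f"SELECT {columns} FROM orders", params).fetchone()
totals = dict(zip((f"s_field_1_{day}" for day in range(days_ahead)), row))
print(totals)
```

A day without any orders yields None rather than 0, which matches how Sum behaves when nothing matches its filter.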
You can approach it with a for loop and store the data in the context like so:
from django.utils import timezone
from django.db.models import Sum
context = {}
for i in range(7):
    qs = Model.objects.filter(delivery_on__date=(timezone.now() + timezone.timedelta(i)).date())
    # context is created once above; re-creating it here would discard the earlier days
    context[f'orders_{i}'] = qs.aggregate(
        field_1_sum=Sum('field_1'),
        field_2_sum=Sum('field_2'),
        field_3_sum=Sum('field_3'),
        field_4_sum=Sum('field_4'),
        field_5_sum=Sum('field_5'))
This loop hits the database 7 times. Alternatively, you can collect the distinct dates with one query and then aggregate per date:
context = {}
qs = Model.objects.filter(delivery_on__date__lte=timezone.now()-timezone.timedelta(days=7)).order_by('delivery_on')
# Clear the ordering so distinct() applies to the date column only
dates = qs.order_by().values_list('delivery_on__date', flat=True).distinct()
for i in dates:
    _qs = qs.filter(delivery_on__date=i)
    context[f'orders_{i}'] = _qs.aggregate(
        field_1_sum=Sum('field_1'),
        field_2_sum=Sum('field_2'),
        field_3_sum=Sum('field_3'),
        field_4_sum=Sum('field_4'),
        field_5_sum=Sum('field_5'))
You define how far back the queryset reaches, take the distinct dates of the matching orders, and then filter the already-filtered queryset by each of those dates.
I have Users who take Surveys periodically. The system has multiple surveys which it issues at set intervals from the submitted date of the last issued survey of that particular type.
class Survey(Model):
    name = CharField()
    description = TextField()
    interval = DurationField()
    users = ManyToManyField(User, related_name='registered_surveys')
    ...
class SurveyRun(Model):
    ''' A user's answers for 1 taken survey '''
    user = ForeignKey(User, related_name='runs')
    survey = ForeignKey(Survey, related_name='runs')
    created = models.DateTimeField(auto_now_add=True)
    submitted = models.DateTimeField(null=True, blank=True)
    # answers = ReverseForeignKey...
So with the models above a user should be alerted to take survey A next on this date:
A.interval + SurveyRun.objects.filter(
    user=user,
    survey=A
).latest('submitted').submitted
I want to run a daily periodic task which queries all users and creates new runs for all users who have a survey due according to this criteria:
For each survey the user is registered:
if no runs exist for that user-survey combo then create the first run for that user-survey combination and alert the user
if there are runs for that survey and none are open (an open run has been created but not submitted so submitted=None) and the latest one's submitted date plus the survey's interval is <= today, create a new run for that user-survey combo and alert the user
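The two rules above can be written down as a small plain-Python predicate, which is handy as a reference when checking what a database-side annotation returns. This is only a sketch under assumptions: `run_is_due` is a hypothetical helper, and it takes the `submitted` values of one user-survey combination as a list, with None marking an open run.

```python
from datetime import datetime, timedelta
from typing import Optional, Sequence


def run_is_due(
    submitted_dates: Sequence[Optional[datetime]],
    interval: timedelta,
    now: datetime,
) -> bool:
    """Return True if a new run should be created for one user-survey pair."""
    if not submitted_dates:
        return True  # rule 1: no runs exist yet
    if any(submitted is None for submitted in submitted_dates):
        return False  # an open (unsubmitted) run blocks a new one
    # rule 2: latest submission plus the interval has been reached
    return max(submitted_dates) + interval <= now


now = datetime(2024, 1, 10)
week = timedelta(days=7)
print(run_is_due([], week, now))                      # no runs yet
print(run_is_due([None], week, now))                  # one open run
print(run_is_due([datetime(2024, 1, 1)], week, now))  # submitted 9 days ago
```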
Ideally I could create a manager method which would annotate with a surveys_due field like:
users_with_surveys_due = User.objects.with_surveys_due().filter(surveys_due__isnull=False)
Where the annotated field would be a queryset of Survey objects for which the user needs to submit a new round of answers.
And I could issue alerts like this:
for user in users_with_surveys_due.all():
    for survey in user.surveys_due:
        new_run = SurveyRun.objects.create(
            user=user,
            survey=survey
        )
        alert_user(user, new_run)
However I would settle for a boolean flag annotation on the User object indicating one of the registered_surveys needs to create a new run.
How would I go about implementing something like this with_surveys_due() manager method so that Postgres does all the heavy lifting? Is it possible to annotate with a collection of objects, like a reverse FK?
UPDATE:
For clarity here is my current task in python:
def make_new_runs_and_alert_users():
    runs = []
    Srun = apps.get_model('surveys', 'SurveyRun')
    for user in get_user_model().objects.prefetch_related('registered_surveys', 'runs').all():
        for srvy in user.registered_surveys.all():
            runs_for_srvy = user.runs.filter(survey=srvy)
            # no runs exist for this registered survey, create first run
            if not runs_for_srvy.exists():
                runs.append(Srun(user=user, survey=srvy))
                ...
            # check this survey has no open runs
            elif not runs_for_srvy.filter(submitted=None).exists():
                latest = runs_for_srvy.latest('submitted')
                if (latest.submitted + srvy.interval) <= timezone.now():
                    runs.append(Srun(user=user, survey=srvy))
    Srun.objects.bulk_create(runs)
UPDATE #2:
In attempting to use Dirk's solution I have this simple example:
In [1]: test_user.runs.values_list('survey__name', 'submitted')
Out[1]: <SurveyRunQuerySet [('Test', None)]>
In [2]: test_user.registered_surveys.values_list('name', flat=True)
Out[2]: <SurveyQuerySet ['Test']>
The user has one open run (submitted=None) for the Test survey and is registered to one survey (Test). He/she should not be flagged for a new run, since there is an unsubmitted run outstanding for the only survey he/she is registered for. So I created a function encapsulating Dirk's solution, called get_users_with_runs_due:
In [10]: get_users_with_runs_due()
Out[10]: <UserQuerySet [<User: test#gmail.com>]>  # <-- should be an empty queryset
In [107]: for user in _:
    print(user.email, user.has_survey_due)
test#gmail.com True  # <-- should be false
UPDATE #3:
In my previous update I had made some changes to the logic to properly match what I wanted but neglected to mention or show the changes. Here is the query function below with comments by the changes:
def get_users_with_runs_due():
    today = timezone.now()
    survey_runs = SurveyRun.objects.filter(
        survey=OuterRef('pk'),
        user=OuterRef(OuterRef('pk'))
    ).order_by('-submitted')
    pending_survey_runs = survey_runs.filter(submitted__isnull=True)
    surveys = Survey.objects.filter(
        users=OuterRef('pk')
    ).annotate(
        latest_submission_date=Subquery(
            survey_runs.filter(submitted__isnull=False).values('submitted')[:1]
        )
    ).annotate(
        has_survey_runs=Exists(survey_runs)
    ).annotate(
        has_pending_runs=Exists(pending_survey_runs)
    ).filter(
        Q(has_survey_runs=False) |  # either has no runs for this survey or
        (  # has no pending runs and submission date meets criteria
            Q(has_pending_runs=False, latest_submission_date__lte=today - F('interval'))
        )
    )
    return User.objects.annotate(has_survey_due=Exists(surveys)).filter(has_survey_due=True)
UPDATE #4:
I tried to isolate the issue by creating a function which would make most of the annotations on the Surveys by user in an attempt to check the annotation on that level prior to querying the User model with it.
def annotate_surveys_for_user(user):
    today = timezone.now()
    survey_runs = SurveyRun.objects.filter(
        survey=OuterRef('pk'),
        user=user
    ).order_by('-submitted')
    pending_survey_runs = survey_runs.filter(submitted=None)
    return Survey.objects.filter(
        users=user
    ).annotate(
        latest_submission_date=Subquery(
            survey_runs.filter(submitted__isnull=False).values('submitted')[:1]
        )
    ).annotate(
        has_survey_runs=Exists(survey_runs)
    ).annotate(
        has_pending_runs=Exists(pending_survey_runs)
    )
This worked as expected: the annotations were accurate, and filtering with:
result.filter(
    Q(has_survey_runs=False) |
    (
        Q(has_pending_runs=False) &
        Q(latest_submission_date__lte=today - F('interval'))
    )
)
produced the desired results: An empty queryset where the user should not have any runs due and vice-versa. Why is this not working when making it the subquery and querying from the User model?
To annotate users with whether or not they have a survey due, I'd suggest using a Subquery expression:
from django.db.models import Q, F, OuterRef, Subquery, Exists
from django.utils import timezone
today = timezone.now()
survey_runs = SurveyRun.objects.filter(
    survey=OuterRef('pk'), user=OuterRef(OuterRef('pk'))
).order_by('-submitted')
pending_survey_runs = survey_runs.filter(submitted__isnull=True)
# Parentheses let the chained calls span several lines
surveys = (
    Survey.objects.filter(users=OuterRef('pk'))
    .annotate(latest_submission_date=Subquery(survey_runs.filter(submitted__isnull=False).values('submitted')[:1]))
    .annotate(has_survey_runs=Exists(survey_runs))
    .annotate(has_pending_runs=Exists(pending_survey_runs))
    .filter(Q(has_survey_runs=False) | Q(latest_submission_date__lte=today - F('interval')) & Q(has_pending_runs=False))
)
users_with_surveys_due = (
    User.objects.annotate(has_survey_due=Exists(surveys))
    .filter(has_survey_due=True)
)
I'm still trying to figure out how to do the other one. You cannot annotate a queryset with another queryset, values must be field equivalents. Also you cannot use a Subquery as queryset parameter to Prefetch, unfortunately. But since you're using PostgreSQL you could use ArrayField to list the ids of the surveys in a wrapped value, but I haven't found a way to do that, as you can't use aggregate inside a Subquery.
Imagine I have the following 2 models in a contrived example:
class User(models.Model):
    name = models.CharField()
class Login(models.Model):
    user = models.ForeignKey(User, related_name='logins')
    success = models.BooleanField()
    datetime = models.DateTimeField()
    class Meta:
        get_latest_by = 'datetime'
How can I get a queryset of Users that only contains users whose last login was not successful?
I know the following does not work, but it illustrates what I want to get:
User.objects.filter(login__latest__success=False)
I'm guessing I can do it with Q objects, and/or Case When, and/or some other form of annotation and filtering, but I can't suss it out.
We can use a Subquery here:
from django.db.models import OuterRef, Subquery
latest_login = Subquery(Login.objects.filter(
    user=OuterRef('pk')
).order_by('-datetime').values('success')[:1])
User.objects.annotate(
    latest_login=latest_login
).filter(latest_login=False)
This will generate a query that looks like:
SELECT auth_user.*, (
    SELECT U0.success
    FROM login U0
    WHERE U0.user_id = auth_user.id
    ORDER BY U0.datetime DESC
    LIMIT 1
) AS latest_login
FROM auth_user
WHERE (
    SELECT U0.success
    FROM login U0
    WHERE U0.user_id = auth_user.id
    ORDER BY U0.datetime DESC
    LIMIT 1
) = False
So the outcome of the Subquery is the success of the latest Login object, and if that is False, we add the related User to the QuerySet.
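To try the correlated-subquery idea without a Django project, here is a small sketch using the standard-library sqlite3 module. The `users` and `logins` tables and the sample rows are made up for illustration, and the SQL only mirrors the shape of the generated query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE logins (user_id INTEGER, success INTEGER, datetime TEXT);
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO logins VALUES
        (1, 0, '2024-01-01 10:00'), (1, 1, '2024-01-02 10:00'),
        (2, 1, '2024-01-01 10:00'), (2, 0, '2024-01-02 10:00');
    """
)

# For each user, look up the success flag of the most recent login
# and keep the user only when that flag is false (0).
names = [
    row[0]
    for row in conn.execute(
        """
        SELECT u.name FROM users u
        WHERE (
            SELECT l.success FROM logins l
            WHERE l.user_id = u.id
            ORDER BY l.datetime DESC
            LIMIT 1
        ) = 0
        """
    )
]
print(names)
```

Only bob is returned: alice failed once, but her most recent login succeeded.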
You can first annotate the max dates, and then filter based on success and the max date using F expressions:
from django.db.models import F, Max
User.objects.annotate(max_date=Max('logins__datetime'))\
    .filter(logins__datetime=F('max_date'), logins__success=False)
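To check the boolean, filter on success=False; to get the most recent row, use latest(). Both apply to the Login model, which defines success and get_latest_by.
Your query would look like this (note that it returns the most recent failed Login, not a queryset of users):
Login.objects.filter(success=False).latest()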
I have
class Profile(models.Model):
    turn_off_date = models.DateTimeField(null=True, blank=True, auto_now=False)
I need to find all Profile records whose expiration date is exactly x days from now.
How do I do this?
I can think of iterating through all profiles and manually comparing dates, but that seems inefficient.
Update:
I need to filter by date, not by the datetime field as in the duplicate question.
Right now I am doing it like this:
profiles = Profile.objects.all()
for profile in profiles:
    if profile.days_left() == x:
        print(profile.days_left())
And in my Profile model I defined a method:
def days_left(self):
    turn_off_date = self.turn_off_date
    days_left = (turn_off_date - datetime.now()).days
    if days_left < 0:
        days_left = 0
    return days_left
You can just query with the __date lookup and a given date. From the Django documentation:
For datetime fields, casts the value as date. Allows chaining additional field lookups. Takes a date value.
Profile.objects.filter(turn_off_date__date=(datetime.now() + timedelta(days=x)).date())
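A date-based match is not quite the same thing as the days_left() method from the question, because `timedelta.days` counts whole 24-hour periods and therefore depends on the time of day. The sketch below shows the difference in plain Python; `target_day_bounds` is a hypothetical helper, and naive datetimes are assumed, so time zones are ignored.

```python
from datetime import date, datetime, time, timedelta


def target_day_bounds(today: date, x: int):
    """Return [start, end) datetimes covering the calendar day x days after today."""
    target = today + timedelta(days=x)
    start = datetime.combine(target, time.min)
    return start, start + timedelta(days=1)


now = datetime(2024, 1, 1, 18, 0)
turn_off = datetime(2024, 1, 4, 9, 0)

start, end = target_day_bounds(now.date(), 3)
in_window = start <= turn_off < end   # calendar-day comparison
elapsed_days = (turn_off - now).days  # whole 24-hour periods only

print(in_window, elapsed_days)
```

Here the profile expires three calendar days from now, yet the subtraction reports 2 because only 2 days and 15 hours have elapsed.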
You can use range to achieve this,
Example:
Profile.objects.filter(turn_off_date__range=(start_date, end_date))
I guess something like this will work fine:
From Django 1.9 you can use the __date field lookup, exactly as you
have mentioned in your question. For older versions, you will have to
do with the other methods.
Thanks to https://stackoverflow.com/a/45735324/2950593
I need to find data within a certain set of parameters
I am building a small booking system, that lets user see what vehicles are available for booking for their little safari trip.
The system has bookings that have been entered previously or made previously by a client.
If a booking's pickup_date = 2011-03-01 and dropoff_date = 2011-03-15 and I run a query with pickup=2011-03-09 and dropoff=2011-03-14 in my views as below, it doesn't return any results to see if a booking within that timeframe has been made.
views.py
def dates(request, template_name='groups/_dates.html'):
    pickup = request.GET.get('pickup', 'None')
    dropoff = request.GET.get('dropoff', 'None')
    order = Order.objects.filter(pick_up__lte=pickup).filter(drop_off__gte=dropoff)
    context = {'order': order}
    return render_to_response(template_name, context,
                              context_instance=RequestContext(request))
Any suggestions on how to do this?
Or should I be looking at an alternate way of running this query?
Could it be that the raw strings you are passing to the queryset are not in the correct format? Try converting the strings to datetime objects first.
Later you can try the range lookup, which is more efficient on some DB engines and simpler to read and write.
import datetime
from django.db.models import Q
start_date = datetime.date(2005, 1, 1)
end_date = datetime.date(2005, 3, 31)
orders = Order.objects.filter(drop_off__gte=start_date, pick_up__lte=end_date)
# Or maybe better
orders = Order.objects.filter(Q(drop_off__gte=start_date), Q(pick_up__lte=end_date))
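The filter above encodes the usual interval-overlap test: a booking conflicts with a requested period when it ends on or after the requested start and begins on or before the requested end. A minimal plain-Python sketch, using the dates from the question; `parse_day` and `overlaps` are hypothetical helpers, and the request values are assumed to arrive as ISO `YYYY-MM-DD` strings.

```python
from datetime import date


def parse_day(value: str) -> date:
    """Convert a 'YYYY-MM-DD' string (e.g. from request.GET) into a date."""
    return date.fromisoformat(value)  # available from Python 3.7


def overlaps(a_start: date, a_end: date, b_start: date, b_end: date) -> bool:
    """True when the closed ranges [a_start, a_end] and [b_start, b_end] intersect."""
    return a_start <= b_end and b_start <= a_end


booked = (parse_day("2011-03-01"), parse_day("2011-03-15"))
wanted = (parse_day("2011-03-09"), parse_day("2011-03-14"))
later = (parse_day("2011-03-16"), parse_day("2011-03-20"))

print(overlaps(*booked, *wanted))  # requested period lies inside the booking
print(overlaps(*booked, *later))   # requested period starts after the booking ends
```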
Can you try this:
order = Order.objects.filter(pick_up__date__lte=pickup).filter(drop_off__date__gte=dropoff)
https://docs.djangoproject.com/fr/2.0/ref/models/querysets/#date