I am writing an API endpoint that queries for an Event in the time window given.
The Event model has a start and end time, and the query accepts a search start time and an search end time. Any Event that overlaps between these two windows is returned. This works well.
However, if the user asks to search using a search_time_start that is after the search_time_end, it returns no results (obviously), with no indication why.
Instead I would like to return an Error message indicating what is wrong.
I have a rough idea of how to do this, but not how to plug it into my existing code.
Model
class Event(models.Model):
name = model.CharField(max_length=64)
start_time = models.DateTimeField(null=False)
end_time = models.DateTimeField(null=False)
Serializer
class EventSerializer(serializers.ModelSerializer):
start_time = serializers.DateTimeField()
end_time = serializers.DateTimeField()
class Meta:
model = Event
fields = ("start_time", "end_time")
Filter and View
class EventFilterSet(GeoFilterSet):
search_time_start = DateTimeFilter(
field_name="end_time",
lookup_expr="gte",
help_text=(
"The time for the start of the search. Limits events to those that end after the start time."
),
)
search_time_end = DateTimeFilter(
field_name="start_time",
lookup_expr="lte",
help_text=(
"The time for the end of the search. Limits events to those that start before the end time."
),
)
class Meta:
model = Event
fields = ["search_time_start", "search_time_end"]
class BaseEventViewSet(GenericViewSet):
queryset = Event.objects.all()
serializer_class = EventSerializer
filterset_class = EventFilterSet
ordering = ("start_time",)
I have written this snippet which I think should work, but I am unsure if it should go in a view or serializer, and how to make sure the method is actually called. I suspect I may be incorrectly mixing Class-based View and Function-based views here... so perhaps there's a better way to accomplish this.
def check_valid_start_end_search(self, request):
start = request.query_params.get("start_time")
end = request.query_params.get("end_time")
if not start < end:
return Response({"error": "The start time must be after the end time"}, status=HTTP_400_BAD_REQUEST)
I've seen some similar quesitons here, but they all deal with function-based views or handling the same error but on a model field, not a query param
You can raise custom errors from your serializers also to validate start time and end time.
try this solution.
class EventSerializer(serializers.ModelSerializer):
start_time = serializers.DateTimeField()
end_time = serializers.DateTimeField()
class Meta:
model = Event
fields = ("start_time", "end_time")
def validate(self,attrs):
start_time=attrs.get('start_time')
end_time=attrs.get('end_time')
if start_time < end_time:
raise serializers.ValidationError('The start time must be after the end time')
return attrs
Related
So i got my serializer called like this:
result_serializer = TaskInfoSerializer(tasks, many=True)
And the serializer:
class TaskInfoSerializer(serializers.ModelSerializer):
done_jobs_count = serializers.SerializerMethodField()
total_jobs_count = serializers.SerializerMethodField()
task_status = serializers.SerializerMethodField()
class Meta:
model = Task
fields = ('task_id', 'task_name', 'done_jobs_count', 'total_jobs_count', 'task_status')
def get_done_jobs_count(self, obj):
qs = Job.objects.filter(task__task_id=obj.task_id, done_flag=1)
condition = False
# Some complicate logic to determine condition that I can't reveal due to business
result = qs.count() if condition else 0
# this function take around 3 seconds
return result
def get_total_jobs_count(self, obj):
qs = Job.objects.filter(task__task_id=obj.task_id)
# this query take around 3-5 seconds
return qs.count()
def get_task_status(self, obj):
done_count = self.get_done_jobs_count(obj)
total_count = self.get_total_jobs_count(obj)
if done_count >= total_count:
return 'done'
else:
return 'not yet'
When the get_task_status function is called, it call other 2 function and make those 2 costly query again.
Is there any best way to prevent that? And I dont really know the order of those functions to be called, is it based on the order declare in Meta's fields? Or above that?
Edit:
The logic in get_done_jobs_count is a bit complicate and I cannot make it into a single query when get task
Edit 2:
I just bring all those count function into model and use cached_property
https://docs.djangoproject.com/en/2.1/ref/utils/#module-django.utils.functional
But it raise another question: Is that number reliable? I don't understand much about django cache, is that cached_property is only exist for this instance (just until the API get list of tasks return a response) or will it exist for sometime?
I just try cached_property and it did resolve the problem.
Model:
from django.utils.functional import cached_property
from django.db import models
class Task(models.Model):
task_id = models.AutoField(primary_key=True)
task_name = models.CharField(default='')
#cached_property
def done_jobs_count(self):
qs = self.jobs.filter(done_flag=1)
condition = False
# Some complicate logic to determine condition that I can't reveal due to business
result = qs.count() if condition else 0
# this function take around 3 seconds
return result
#cached_property
def total_jobs_count(self):
qs = Job.objects.filter(task__task_id=obj.task_id)
# this query take around 3-5 seconds
return qs.count()
#property
def task_status(self):
done_count = self.done_jobs_count
total_count = self.total_jobs_count
if done_count >= total_count:
return 'done'
else:
return 'not yet'
Serializer:
class TaskInfoSerializer(serializers.ModelSerializer):
class Meta:
model = Task
fields = ('task_id', 'task_name', 'done_jobs_count', 'total_jobs_count', 'task_status')
You could annotate those values to avoid making extra queries. So the queryset passed to the serializer would look something like this (it might change depending on the Django version you're using and the related query name for jobs):
tasks = tasks.annotate(
done_jobs=Count('jobs', filter=Q(done_flag=1)),
total_jobs=Count('jobs'),
)
result_serializer = TaskInfoSerializer(tasks, many=True)
Then the serializer method would look like:
def get_task_status(self, obj):
if obj.done_jobs >= obj.total_jobs:
return 'done'
else:
return 'not yet'
Edit: a cached_property won't help you if you have to call the method for each task instance (which seems the case). The problem is not so much the calculation but whether you have to hit the database for each separate task. You have to focus on getting all the information needed for the calculation in a single query. If that's impossible or too complicated, maybe think about changing the data structure (models) in order to facilitate that.
Using iterator() and counting Iterator might solve your problem.
job_iter = Job.objects.filter(task__task_id=obj.task_id).iterator()
count = len(list(job_iter))
return count
You can use select_related() and prefetch_related() for Retrieve everything at once if you will need them.
Note: if you use iterator() to run the query, prefetch_related() calls will be ignored
You might want to go through documentation for optimisation
I have a History model like below
class History(models.Model):
class Meta:
app_label = 'subscription'
ordering = ['-start_datetime']
subscription = models.ForeignKey(Subscription, related_name='history')
FREE = 'free'
Premium = 'premium'
SUBSCRIPTION_TYPE_CHOICES = ((FREE, 'Free'), (Premium, 'Premium'),)
name = models.CharField(max_length=32, choices=SUBSCRIPTION_TYPE_CHOICES, default=FREE)
start_datetime = models.DateTimeField(db_index=True)
end_datetime = models.DateTimeField(db_index=True, blank=True, null=True)
cancelled_datetime = models.DateTimeField(blank=True, null=True)
Now i have a queryset filtering like below
users = get_user_model().objects.all()
queryset = users.exclude(subscription__history__end_datetime__lt=timezone.now())
The issue is that in the exclude above it is checking end_datetime for all the rows for a particular history object. But i only want to compare it with first row of history object.
Below is how a particular history object looks like. So i want to write a queryset filter which can do datetime comparison on first row only.
You could use a Model Manager method for this. The documentation isn't all that descriptive, but you could do something along the lines of:
class SubscriptionManager(models.Manager):
def my_filter(self):
# You'd want to make this a smaller query most likely
subscriptions = Subscription.objects.all()
results = []
for subscription in subscriptions:
sub_history = subscription.history_set.first()
if sub_history.end_datetime > timezone.now:
results.append(subscription)
return results
class History(models.Model):
subscription = models.ForeignKey(Subscription)
end_datetime = models.DateTimeField(db_index=True, blank=True, null=True)
objects = SubscriptionManager()
Then: queryset = Subscription.objects().my_filter()
Not a copy-pastable answer, but shows the use of Managers. Given the specificity of what you're looking for, I don't think there's a way to get it just via the plain filter() and exclude().
Without knowing what your end goal here is, it's hard to say whether this is feasible, but have you considered adding a property to the subscription model that indicates whatever you're looking for? For example, if you're trying to get everyone who has a subscription that's ending:
class Subscription(models.Model):
#property
def ending(self):
if self.end_datetime > timezone.now:
return True
else:
return False
Then in your code: queryset = users.filter(subscription_ending=True)
I have tried django's all king of expressions(aggregate, query, conditional) but was unable to solve the problem so i went with RawSQL and it solved the problem.
I have used the below SQL to select the first row and then compare the end_datetime
SELECT (end_datetime > %s OR end_datetime IS NULL) AS result
FROM subscription_history
ORDER BY start_datetime DESC
LIMIT 1;
I will select my answer as accepted if not found a solution with queryset filter chaining in next 2 days.
I'm working on a django project with the following models.
class User(models.Model):
pass
class Item(models.Model):
user = models.ForeignKey(User)
item_id = models.IntegerField()
There are about 10 million items and 100 thousand users.
My goal is to override the default admin search that takes forever and
return all the matching users that own "all" of the specified item ids within a reasonable timeframe.
These are a couple of the tests I use to better illustrate my criteria.
class TestSearch(TestCase):
def search(self, searchterm):
"""A tuple is returned with the first element as the queryset"""
return do_admin_search(User.objects.all())
def test_return_matching_users(self):
user = User.objects.create()
Item.objects.create(item_id=12345, user=user)
Item.objects.create(item_id=67890, user=user)
result = self.search('12345 67890')
assert_equal(1, result[0].count())
assert_equal(user, result[0][0])
def test_exclude_users_that_do_not_match_1(self):
user = User.objects.create()
Item.objects.create(item_id=12345, user=user)
result = self.search('12345 67890')
assert_false(result[0].exists())
def test_exclude_users_that_do_not_match_2(self):
user = User.objects.create()
result = self.search('12345 67890')
assert_false(result[0].exists())
The following snippet is my best attempt using annotate that takes over 50 seconds.
def search_by_item_ids(queryset, item_ids):
params = {}
for i in item_ids:
cond = Case(When(item__item_id=i, then=True), output_field=BooleanField())
params['has_' + str(i)] = cond
queryset = queryset.annotate(**params)
params = {}
for i in item_ids:
params['has_' + str(i)] = True
queryset = queryset.filter(**params)
return queryset
Is there anything I can do to speed it up?
Here's some quick suggestions that should improve performance drastically.
Use prefetch_related` on the initial queryset to get related items
queryset = User.objects.filter(...).prefetch_related('user_set')
Filter with the __in operator instead of looping through a list of IDs
def search_by_item_ids(queryset, item_ids):
return queryset.filter(item__item_id__in=item_ids)
Don't annotate if it's already a condition of the query
Since you know that this queryset only consists of records with ids in the item_ids list, no need to write that per object.
Putting it all together
You can speed up what you are doing drastically just by calling -
queryset = User.objects.filter(
item__item_id__in=item_ids
).prefetch_related('user_set')
with only 2 db hits for the full query.
I am trying to build a messaging application with Django. The reason I don’t use postman is that I need messaging between other objects than users and I don’t need most of the postman’s features.
Here are my models:
class Recipient(models.Model):
...
def get_unread_threads():
see below...
class Thread(models.Model):
author = models.ForeignKey(Recipient, related_name='initiated_threads')
recipients = models.ManyToManyField(
Tribe,
related_name='received_thread',
through='ThreadReading')
subject = models.CharField(max_length=64)
class Meta:
app_label = 'game'
class Message(models.Model):
author = models.ForeignKey(Recipient, related_name='sent_messages')
date_add = models.DateTimeField(auto_now_add=True)
date_edit = models.DateTimeField(blank=True)
content = models.TextField(max_length=65535)
thread = models.ForeignKey(Thread)
class Meta:
get_latest_by = 'date'
class ThreadReading(models.Model):
thread = models.ForeignKey(Thread)
recipient = models.ForeignKey(Recipient)
date_last_reading = models.DateTimeField(auto_now=True)
My problem is about get_unread_threads. I can’t really find out how to do that. Here is a first try:
def get_unread_threads(self):
"""
return a queryset corresponding to the threads
which at least one message is unread by the recipient
"""
try:
query = self.received_thread.filter(
message__latest__date_add__gt=\
self.threadreading_set.get(thread_id=F('id')).date_last_reading)
except ObjectDoesNotExist:
query = None
return query
But obviously it doesn’t work because lookup can’t follow method latest.
Here you go:
# Get all the readings of the user
thread_readings = recipient.threadreading_set.all()
# Build a query object including all messages who's last reading is after the
# last edit that was made AND whose thread is the thread of the current
# iteration's thread reading
q = models.Q()
for thread_reading in thread_readings:
q = q | models.Q(
models.Q(
date_edit__lte=thread_reading.date_last_reading
& models.Q(
thread=thread_reading.thread
)
)
)
# Get a queryset of all the messages, including the threads (via a JOIN)
queryset = Message.objects.select_related('thread')
# Now, exclude from the queryset every message that matches the above query
# (edited prior to last reading) OR that is in the list of last readings
queryset = queryset.exclude(
q | models.Q(
thread__pk__in=[thread_reading.pk for thread_reading in thread_readings]
)
)
# Make an iterator (to pretend that this is efficient) and return a generator of it
iterator = queryset.iterator()
return (message.thread for message in iterator)
:)
Now, don't ever actually do this - rethink your models. I would read a book called "Object Oriented Analysis and Design with Applications". It'll teach you a great deal about how to think when you're data modelling.
I have some complex business logic that I have placed in a custom ModelManager. The manager method returns a tuple of values rather than a queryset. Is this considered bad practice? if so, what is the recommended approach. I do not want the logic in the View, and Django has no Service tier. Plus, my logic needs to potentially perform multiple queries.
The logic needs to select an Event closest to the current time, plus 3 events either side. When placed in the template, it is helpful to know the closest event as this is the event initially displayed in a full-screen slider.
The current call is as follows:
closest_event, previous_events, next_events = Event.objects.closest()
The logic does currently work fine. I am about to convert my app. to render the Event data as JSON in the template so that I can bootstrap a backbone.js View on page load. I plan to use TastyPie to render a Resource server-side into the template. Before I refactor my code, it would be good to know my current approach is not considered bad practice.
This is how my app. currently works:
views.py
class ClosestEventsListView(TemplateView):
template_name = 'events/event_list.html'
def get(self, request, *args, **kwargs):
context = self.get_context_data(**kwargs)
closest_event, previous_events, next_events = Event.objects.closest()
context['closest_event'] = closest_event
context['previous_events'] = previous_events
context['next_events'] = next_events
return self.render_to_response(context)
models.py
from datetime import timedelta
from django.db import models
from django.utils import timezone
from model_utils.models import TimeStampedModel
class ClosestEventsManager(models.Manager):
def closest(self, **kwargs):
"""
We are looking for the closest event to now plus the 3 events either side.
First select by date range until we have a count of 7 or greater
Initial range is 1 day eithee side, then widening by another day, if required
Then compare delta for each event data and determine the closest
Return closest event plus events either side
"""
now = timezone.now()
range_in_days = 1
size = 0
while size < 7:
start_time = now + timedelta(days=-range_in_days)
end_time = now + timedelta(days=range_in_days)
events = self.filter(date__gte=start_time, date__lte=end_time, **kwargs).select_related()
size = events.count()
range_in_days += 1
previous_delta = None
closest_event = None
previous_events = None
next_events = None
position = 0
for event in events:
delta = (event.date - now).total_seconds()
delta = delta * -1 if delta < 0 else delta
if previous_delta and previous_delta <= delta:
# we have found the closest event. Now, based on
# position get events either size
next_events = events[:position-1]
previous_events = events[position:]
break
previous_delta = delta
closest_event = event
position += 1
return closest_event, previous_events, next_events
class Event(TimeStampedModel):
class Meta:
ordering = ['-date']
topic = models.ForeignKey(Topic)
event_type = models.ForeignKey(EventType)
title = models.CharField(max_length=100)
slug = models.SlugField()
date = models.DateTimeField(db_index=True)
end_time = models.TimeField()
location = models.ForeignKey(Location)
twitter_hashtag = models.CharField(null=True, blank=True, max_length=100)
web_link = models.URLField(null=True, blank=True)
objects = ClosestEventsManager()
def __unicode__(self):
return self.title
I don't think it's bad practice to return a tuple. The first example in the ModelManager docs returns a list.
Saying that, if you want to build a queryset instead then you could do something like this -
def closest(self, **kwargs):
# get the events you want
return self.filter(pk__in=([event.id for event in events]))
It's fine, even Django's own get_or_create does it. Just make sure it's clear to whoever's using the function that it's not chainable (ie doesn't return a queryset).