Comparing Date to Minimum Future Value in Other Model - django

I have two django models that define:
Item = A set of items with an expiry date.
Event = A set of events that have a start and end date.
My aim is that when an item is displayed, its expiry date is conditionally formatted based on whether the item expires after the next event's end date but before the following event's end date (so it acts as a warning that the item is due to expire).
The problem is how best to manage this.
If I were accessing the database directly, I'd use subqueries to get the minimum end date still in the future, and then compare that to the expiry date to switch the formatting.
From research, I'm coming to the conclusion that this is best handled in the view logic rather than in a method on either model, but I don't know how best to put this together so that I can return both my minimum future date and my item list from the Item model.
The current view is pretty simple at this stage (it will have more filtering options later):
def itemlist(request):
    item_list = Item.objects.all()
    return render(request, "itemlist.html", {'item_list': item_list})
but I can't see an easy way of writing a Django equivalent of what I'd do in straight SQL:
select item from items where status != expired and expiry_date <= (select min(end_date) from events where end_date >= getdate() )
EDIT: Since I've written this, I've realised the comparison for what I want is a little more complex, as it's not the minimum date, it's the next to minimum.
For Item A, expiry_date 01/05/19
Event A: end_date 25/04/19
Event B: end_date 10/05/19
What I need it to do is check the events when reading back the item list, see that Item A's expiry date is after the next event.end_date for event A, but is before the event.end_date for event B, so set a flag for using conditional formatting on the template's expiry date display.
Eventually, I suppose, the wish list is to also be able to say for every item "what's the latest event I can renew this item before it expires if there's an event in the list after its expiry time."

I could not completely understand your requirements from your description, but you can use subqueries in Django as well. If you annotate like this:
import datetime
from django.db.models import F, Subquery

now = datetime.datetime.utcnow()
# End date of the latest event that is still in the future.
latest_end = Event.objects.filter(end_date__gt=now).order_by('-end_date').values('end_date')[:1]
Item.objects.annotate(last_event_time=Subquery(latest_end))
Each item in the resulting queryset will have a last_event_time field, which holds the end_date of the latest future event. Ordering by 'end_date' (ascending) instead would give the end date of the next upcoming event.
You can also use this field in further filtering, using F expressions:
Item.objects.annotate(last_event_time=Subquery(latest_end)).filter(expiry_date__lte=F('last_event_time'))
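For the "next to minimum" comparison described in the question's edit, here is a minimal plain-Python sketch of the rule itself, using the Item A / Event A / Event B dates from the question. The function names, the value of today, and the choice of inclusive/exclusive boundaries are assumptions for illustration; in a view, the dates would come from the Event queryset.

```python
from datetime import date

def expiry_warning(expiry, event_end_dates, today):
    """Return True when the item outlives the next event but not the one after it."""
    # Keep only events that have not ended yet, soonest first.
    upcoming = sorted(d for d in event_end_dates if d >= today)
    if len(upcoming) < 2:
        return False  # no "following" event to compare against
    next_end, following_end = upcoming[0], upcoming[1]
    # Boundary handling (< and <=) is an assumption; adjust as needed.
    return next_end < expiry <= following_end

def last_event_before_expiry(expiry, event_end_dates, today):
    """Latest upcoming event ending on or before the expiry date, or None."""
    candidates = [d for d in event_end_dates if today <= d <= expiry]
    return max(candidates, default=None)

events = [date(2019, 4, 25), date(2019, 5, 10)]  # Event A, Event B
item_a_expiry = date(2019, 5, 1)
today = date(2019, 4, 1)  # assumed "current" date for the example

warn = expiry_warning(item_a_expiry, events, today)
renew_by = last_event_before_expiry(item_a_expiry, events, today)
```

Here warn is True (Item A expires between Event A and Event B), and renew_by is Event A's end date, which covers the "latest event I can renew this item before it expires" wish.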

Related

How to find duplicate and deactivate duplicates for user attributes

Suppose we have a model in django defined as follows:
class DateClass(models.Model):
    user_id = models.IntegerField(...)
    sp_date = models.DateField(...)
    is_active = models.BooleanField(...)
    ...
I follow an insert-only policy here, i.e. for a specific user there will be only one active row per date. That means there will be only one active row for user=1 in the date table for each of the sp_date values 27/10/2021, 28/10/2021 and so on. There shouldn't be two active rows for 27/10/2021 for user=1, but other users can have their own rows for 27/10/2021. Whenever a date has to be updated, I deactivate (is_active=False) the previous row and add a new row for that date.
I want to find duplicate active dates for each user in one single query, and then deactivate (set is_active=False) all the duplicates except the last row (the row which was inserted last). Two rows are duplicates if their user_id and sp_date values are equal and both have is_active=True. I know how to find duplicates for a specific column, which is fairly easy, but I can't think of something that does the above task elegantly. I can only think of the following approach:
for user in users:
    dates = DateClass.objects.filter(user_id=user.id, is_active=True)
    for date in dates:
        days = dates.filter(
            sp_date=date.sp_date, is_active=True
        )
        if days.count() > 1:
            last_day = days.last()
            days.exclude(id=last_day.id).update(is_active=False)
As you can see, the above is not that efficient, as I have to loop through all users. Is there any way to do this more efficiently? I am using PostgreSQL as the database.
There is a great answer for querying duplicates across multiple fields in this answer; I don't want to take the credit or reinvent the wheel, so I will point you to it.
For your case it should be:
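from django.db.models import Count, Max

# Group active rows by (user_id, sp_date) and keep only groups with duplicates.
# last_id assumes the highest id is the most recently inserted row.
duplicate_date_class = (
    DateClass.objects.filter(is_active=True)
    .values('user_id', 'sp_date')
    .annotate(records=Count('id'), last_id=Max('id'))
    .filter(records__gt=1)
)
# Then deactivate every duplicate except the last row.
# update() cannot be called on a sliced queryset, so exclude the last id instead.
for date_class in duplicate_date_class:
    DateClass.objects.filter(
        user_id=date_class['user_id'],
        sp_date=date_class['sp_date'],
        is_active=True,
    ).exclude(id=date_class['last_id']).update(is_active=False)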
If you want to prevent duplicate combinations of multiple fields, I suggest taking a look at unique_together for model validation.
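Since the question asks for a single query, here is a hedged sketch of what that could look like at the SQL level, run against an in-memory SQLite database so it is self-contained. The table name dateclass and the sample rows are made up for illustration, and "last inserted" is assumed to mean "highest id"; the same statement could be adapted for PostgreSQL and executed as raw SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dateclass ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, sp_date TEXT, is_active INTEGER)"
)
rows = [
    (1, 1, "2021-10-27", 1),
    (2, 1, "2021-10-27", 1),  # duplicate of row 1, inserted later
    (3, 1, "2021-10-28", 1),
    (4, 2, "2021-10-27", 1),  # same date, different user: not a duplicate
    (5, 1, "2021-10-27", 0),  # already inactive, ignored
]
conn.executemany("INSERT INTO dateclass VALUES (?, ?, ?, ?)", rows)

# Deactivate every active row that is not the newest active row of its
# (user_id, sp_date) group.
conn.execute(
    """
    UPDATE dateclass
    SET is_active = 0
    WHERE is_active = 1
      AND id NOT IN (
          SELECT MAX(id) FROM dateclass
          WHERE is_active = 1
          GROUP BY user_id, sp_date
      )
    """
)
active_ids = [r[0] for r in conn.execute(
    "SELECT id FROM dateclass WHERE is_active = 1 ORDER BY id")]
```

After the update only rows 2, 3 and 4 remain active: row 1 lost to the later duplicate, and the other user's row is untouched.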

Django create date list from existing objects

I have a question about how to make a linked list of dates based on existing objects. First of all, I have a model with a DateTimeField which stores the date and time at which the object was added.
I have something like:
pk | name   | date
1  | name 1 | 2016-08-02 16:14:30.405305
2  | name 2 | 2016-08-02 16:15:30.405305
3  | name 3 | 2016-08-03 16:46:29.532976
4  | name 4 | 2016-08-04 16:46:29.532976
I have some records with the same day but a different time. What I want is a list displaying only the unique days:
2016-08-02
2016-08-03
2016-08-04
Also, because I'm using the CBV DayArchiveView, I want to add a link to each of those elements to list the objects per day, with a URL pattern like this:
url(r'^archive/(?P<year>[0-9]{4})/(?P<month>[-\w]+)/(?P<day>[0-9]+)/$', ArticleDayArchiveView.as_view(), name="archive_day"),
The truth is that I don't have a clue how to achieve that. Can you help me?
Extracting unique dates
instances = YourModel.objects.all()
unique_dates = list(set(map(lambda x: x.date.strftime("%Y-%m-%d"), instances)))
About listing them, your url pattern looks ok. You need to define a view in order to retrieve them and wire up with that url.
UPDATE:
If you want to order them, just:
sorted_dates = sorted(unique_dates)
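As a rough, framework-free sketch of the same idea, using the timestamps from the question's sample table: the path layout with a numeric month is only an assumption, and in a real project the links would normally be built with reverse('archive_day', ...) or the {% url %} tag so that they match the view's month format.

```python
from datetime import datetime

# Timestamps copied from the sample table in the question.
stamps = [
    datetime(2016, 8, 2, 16, 14, 30, 405305),
    datetime(2016, 8, 2, 16, 15, 30, 405305),
    datetime(2016, 8, 3, 16, 46, 29, 532976),
    datetime(2016, 8, 4, 16, 46, 29, 532976),
]

# A set comprehension drops repeated days; sorted() restores chronological order.
unique_days = sorted({s.date() for s in stamps})

# Hypothetical path layout with a numeric month; adjust to the view's month_format.
links = [(d.isoformat(), f"/archive/{d:%Y/%m/%d}/") for d in unique_days]
```

On the database side, QuerySet.datetimes('date', 'day') should give the distinct days without loading every instance, though I have not checked it against this exact model.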

Django ORM: Select items where latest status is `success`

I have this model.
class Item(models.Model):
    name = models.CharField(max_length=128)
An Item gets transferred several times. A transfer can be successful or not.
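class TransferLog(models.Model):
    # on_delete is required since Django 2.0; CASCADE matches the old default.
    item = models.ForeignKey(Item, on_delete=models.CASCADE)
    timestamp = models.DateTimeField()
    success = models.BooleanField(default=False)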
How can I query for all Items whose latest TransferLog was successful?
With "latest" I mean ordered by timestamp.
TransferLog Table
Here is a data sample. Here item1 should not be included, since the last transfer was not successful:
ID|item_id|timestamp |success
---------------------------------------
1 | item1 |2014-11-15 12:00:00 | False
2 | item1 |2014-11-15 14:00:00 | True
3 | item1 |2014-11-15 16:00:00 | False
I know how to solve this with a loop in python, but I would like to do the query in the database.
An efficient trick is possible if timestamps in the log are increasing, that is, if the end of a transfer is logged as the timestamp (not the start of the transfer), or if you can at least expect that an older transfer has ended before a newer one started. Then you can use the TransferLog object with the highest id instead of the one with the highest timestamp.
from django.db.models import Max
qs = TransferLog.objects.filter(
    id__in=TransferLog.objects.values('item')
        .annotate(max_id=Max('id')).values('max_id'),
    success=True)
It makes groups by item_id in the subquery and sends the highest id for every group to the main query, where it is filtered by success of the latest row in the group.
Django compiles this directly into a single query, which is about as good as it gets.
I verified the compiled SQL with: print(qs.query.get_compiler('default').as_sql())
SELECT L.id, L.item_id, L.timestamp, L.success FROM app_transferlog L
WHERE L.success = true AND L.id IN
( SELECT MAX(U0.id) AS max_id FROM app_transferlog U0 GROUP BY U0.item_id )
(I edited the example result compiled SQL for better readability by replacing many "app_transferlog"."field" by a short alias L.field, by substituting the True parameter directly into SQL and by editing whitespace and parentheses.)
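To see the query shape in action without a Django project, here is a small sketch that runs essentially the same SQL against an in-memory SQLite table filled with the question's sample rows. The table name transferlog and the extra item2 rows are assumptions added so that the result is not empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transferlog ("
    "id INTEGER PRIMARY KEY, item_id TEXT, timestamp TEXT, success INTEGER)"
)
conn.executemany(
    "INSERT INTO transferlog VALUES (?, ?, ?, ?)",
    [
        (1, "item1", "2014-11-15 12:00:00", 0),
        (2, "item1", "2014-11-15 14:00:00", 1),
        (3, "item1", "2014-11-15 16:00:00", 0),  # latest for item1: failed
        (4, "item2", "2014-11-15 13:00:00", 0),  # hypothetical extra item
        (5, "item2", "2014-11-15 15:00:00", 1),  # latest for item2: succeeded
    ],
)

# Keep only rows that are both successful and the highest id of their item.
latest_successful = conn.execute(
    """
    SELECT L.id, L.item_id
    FROM transferlog L
    WHERE L.success = 1
      AND L.id IN (SELECT MAX(U0.id) FROM transferlog U0 GROUP BY U0.item_id)
    """
).fetchall()
```

Only item2 comes back; item1 is dropped because its highest-id row is a failure, even though an earlier transfer succeeded.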
It can be improved by adding some example filter and by selecting the related Item in the same query:
kwargs = {}  # e.g. filter: kwargs = {'timestamp__gte': ..., 'timestamp__lt': ...}
qs = TransferLog.objects.filter(
    id__in=TransferLog.objects.filter(**kwargs).values('item')
        .annotate(max_id=Max('id')).values('max_id'),
    success=True).select_related('item')
Verified how compiled to SQL: print(qs.query.get_compiler('default').as_sql()[0])
SELECT L.id, L.item_id, L.timestamp, L.success, I.id, I.name
FROM app_transferlog L INNER JOIN app_item I ON ( L.item_id = I.id )
WHERE L.success = %s AND L.id IN
( SELECT MAX(U0.id) AS max_id FROM app_transferlog U0
WHERE U0.timestamp >= %s AND U0.timestamp < %s
GROUP BY U0.item_id )
print(qs.query.get_compiler('default').as_sql()[1])
# result
(True, <timestamp_start>, <timestamp_end>)
Useful fields of latest TransferLog and the related Items are acquired by one query:
for logitem in qs:
    item = logitem.item  # the item is still cached in the logitem
    ...
The query can be optimized further depending on circumstances, e.g. if you are no longer interested in the timestamp and you work with big data...
Without the assumption of increasing timestamps, the problem is considerably harder to express with the plain Django ORM. My solutions can be found here.
EDIT after it has been accepted:
An exact solution for a dataset with non-increasing timestamps is possible with two queries:
1. Get the set of ids of the last failed transfers. (The failed list is used because it is much smaller than the list of successful transfers.)
2. Iterate over the list of all last transfers, excluding items found in the list of failed transfers.
This way you can efficiently simulate queries that would otherwise require custom SQL:
SELECT a_boolean_field_or_expression,
rank() OVER (PARTITION BY parent_id ORDER BY the_maximized_field DESC)
FROM ...
WHERE rank = 1 GROUP BY parent_object_id
I'm thinking about implementing an aggregation function (e.g. Rank(maximized_field)) as an extension for Django with PostgreSQL, since it would be useful.
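For the case where ids do not follow timestamps, one alternative (not the two-query approach above, and not a window function) is a correlated subquery on the timestamp. The sketch below uses SQLite and made-up rows whose ids are deliberately out of time order; it assumes no two rows of the same item share a timestamp.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transferlog ("
    "id INTEGER PRIMARY KEY, item_id TEXT, timestamp TEXT, success INTEGER)"
)
conn.executemany(
    "INSERT INTO transferlog VALUES (?, ?, ?, ?)",
    [
        (1, "item1", "2014-11-15 16:00:00", 0),  # newest for item1, lowest id
        (2, "item1", "2014-11-15 14:00:00", 1),
        (3, "item2", "2014-11-15 15:00:00", 1),  # newest for item2
        (4, "item2", "2014-11-15 13:00:00", 0),
    ],
)

# Correct: compare each row with the newest timestamp of its own item.
by_timestamp = conn.execute(
    """
    SELECT L.item_id FROM transferlog L
    WHERE L.success = 1
      AND L.timestamp = (SELECT MAX(U.timestamp) FROM transferlog U
                         WHERE U.item_id = L.item_id)
    ORDER BY L.item_id
    """
).fetchall()

# For contrast: the max-id shortcut picks the wrong rows on this data.
by_max_id = conn.execute(
    """
    SELECT L.item_id FROM transferlog L
    WHERE L.success = 1
      AND L.id IN (SELECT MAX(U.id) FROM transferlog U GROUP BY U.item_id)
    ORDER BY L.item_id
    """
).fetchall()
```

On this data the timestamp version returns item2 while the max-id shortcut returns item1, which shows why the shortcut depends on increasing timestamps. In Django 1.11 and later, the same shape can probably be written with Subquery and OuterRef.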
Try this:
# your query
items_with_good_translogs = Item.objects.filter(
    id__in=(x.item.id for x in TransferLog.objects.filter(success=True))
)
Since you asked "How can I query for all Items which latest TransferLog was successful?", it is logically easier to follow if you start the query from the Item model.
The Q object can also be useful in places like this (negation, OR, ...).
(x.item.id for x in TransferLog.objects.filter(success=True))
yields the ids of the items that have a TransferLog with success=True. Note that this matches items with any successful transfer, not only those whose latest transfer succeeded.
You will probably have an easier time approaching this from the TransferLog, like so:
dataset = TransferLog.objects.order_by('item', '-timestamp').distinct('item')
Sadly that does not weed out the False items, and I can't find a way to apply the filter AFTER the distinct. (distinct() with field names is also only supported on PostgreSQL.) You can, however, filter after the fact with a Python list comprehension:
dataset = [d.item for d in dataset if d.success]
If you are doing this for log entries within a given time period, it's best to filter before ordering and applying distinct:
dataset = TransferLog.objects.filter(
    timestamp__gt=start,
    timestamp__lt=end
).order_by(
    'item', '-timestamp'
).distinct('item')
If you can modify your models, I actually think you'll have an easier time using ManyToMany relationship instead of explicit ForeignKey -- Django has some built-in convenience methods that will make your querying easier. Docs on ManyToMany are here. I suggest the following model:
class TransferLog(models.Model):
    item = models.ManyToManyField(Item)
    timestamp = models.DateTimeField()
    success = models.BooleanField(default=False)
Then you could do (I know, not a nice, single line of code, but I'm trying to be explicit to be clearer):
results = []
for item in Item.objects.all():
    # The reverse accessor is transferlog_set (single underscore).
    latest = item.transferlog_set.order_by('-timestamp').first()
    if latest is not None and latest.success:
        results.append(item)
Then your results list will have all the items whose latest transfer was successful. I know, it's still a loop in Python... but perhaps a cleaner loop.

Aggregation and extra values with Django

I have a model which looks like this:
class MyModel(models.Model):
    value = models.DecimalField()
    date = models.DateTimeField()
I'm doing this request:
MyModel.objects.aggregate(Min("value"))
and I'm getting the expected result:
{"value__min": the_actual_minimum_value}
However, I can't figure out a way to get at the same time the minimum value AND the associated date (the date at which the minimum value occured).
Does the Django ORM allow this, or do I have to use raw SQL ?
What you want to do is annotate the query, so that you get back your usual results but also have some data added to the result. So:
MyModel.objects.annotate(Min("value"))
This will return the normal result with value__min as an additional value.
In reply to your comment, I think this is what you are looking for? This will return the dates with their corresponding Min values.
MyModel.objects.values('date').annotate(Min("value"))
Edit: In further reply to your comment in that you want the lowest valued entry but also want the additional date field within your result, you could do something like so:
MyModel.objects.values('date').annotate(min_value=Min('value')).order_by('min_value')[0]
This will get the resulting dict you are asking for by ordering the results and then simply taking the first index which will always be the lowest value.
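The "order by the value and take the first row" idea can be checked in isolation with a small SQLite sketch. The table name and rows below are invented for the example, and the SQL is only meant to show the idea, not the exact statement Django generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mymodel (id INTEGER PRIMARY KEY, value REAL, date TEXT)")
conn.executemany(
    "INSERT INTO mymodel (value, date) VALUES (?, ?)",
    [(3.5, "2010-01-01"), (1.25, "2010-01-02"), (2.0, "2010-01-03")],
)

# Sort by value and keep the first row, so the date comes along with the minimum.
min_value, min_date = conn.execute(
    "SELECT value, date FROM mymodel ORDER BY value ASC LIMIT 1"
).fetchone()
```

If a full model instance is acceptable instead of a dict, MyModel.objects.order_by('value').first() should give the same row with less ceremony.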

Retrieving unique results in Django queryset based on column contents

I am not sure if the title makes any sense but here is the question.
Context: I want to keep track of which students enter and leave a classroom, so that at any given time I can know who is inside the classroom. I also want to keep track, for example, how many times a student has entered the classroom. This is a hypothetical example that is quite close to what I want to achieve.
I made a table Classroom and each entry has a Student (ForeignKey), Action (enter,leave), and Date.
My question is how to get the students that are currently inside (ie. their enter actions' date is later than their leave actions' date, or don't have a leave date), and how to specify a date range to get the students that were inside the classroom at that time.
Edit: On better thought I should also add that there are more than one classrooms.
my first attempt was something like this:
students_in = Classroom.objects.filter(classroom__exact=1, action__exact='1')
students_out = Classroom.objects.filter(classroom__exact=1, action__exact='0').values_list('student', flat=True)
students_now = students_in.exclude(student__in=students_out)
where if action == 1 is in, 0 is out.
This however provides the wrong data as soon as a student leaves a classroom and re-enters. She is listed twice in the students_now queryset, as there are two 'enters' and one 'leave'. Also, I can't check upon specific date ranges to see which students have an entry date that is later than their leave date.
To check a field based on the value of another field, use the F() operator.
from django.db.models import F
students_in_classroom_now = Student.objects.filter(leave__gte=F('enter'))
To get all students in the room at a certain time:
import datetime
start_time = datetime.datetime(2010, 1, 21, 10, 0, 0) # 10am yesterday
students_in_classroom_then = Student.objects.filter(enter__lte=start_time,
                                                    leave__gte=start_time)
Django gives you the Q() and F() operators, which are very powerful and enough for most of the situations. However I don't think that it will be enough for you. Let's think about your problem at the SQL level.
We have something like a table Classroom ( action, ts, student_id ). In order to know which students are at the classroom right now, we would have to make something like:
with uber_table as ( /* temporary view with last user_action */
    select action, max(ts) xts, student_id
    from Classroom
    group by action, student_id
)
select distinct a.student_id student_id
from uber_table a, uber_table b
where a.action = 'enter'
/* either he entered and never left */
and (a.student_id not in (select student_id from uber_table where action = 'leave')
/* or he left before he entered again, so he's still in */
or (a.student_id = b.student_id and b.action = 'leave' and b.xts < a.xts))
This is, I believe, standard SQL. However, if you're using an older version of SQLite (before 3.8.3) or MySQL (before 8.0) as the database backend, the WITH keyword for creating temporary views isn't supported and the query will have to get even more complex. There may be a simpler version but I don't really see it.
My point here is that when you get to this level of complexity, F() and Q() become inadequate tools for the job, so I'd rather recommend that you write the SQL code by hand and use Raw SQL in Django.
Should you need to use the more common data access APIs, you should probably restructure your data model in the way @Daniel Roseman implied.
By the way, a query for getting people that were inside the classroom in the same interval is just like that one, but all you have to do is limit the last leave ts to the beginning of the interval and the last enter ts to the end of the interval.
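A somewhat simpler formulation of the same idea is "a student is inside if their most recent action up to the moment of interest is an enter". The sketch below tries this on an in-memory SQLite table; the table layout, student names and timestamps are all hypothetical, and timestamps are stored as fixed-format strings so that string comparison matches time order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE classroom_log ("
    "id INTEGER PRIMARY KEY, classroom INTEGER, student TEXT, action TEXT, ts TEXT)"
)
conn.executemany(
    "INSERT INTO classroom_log (classroom, student, action, ts) VALUES (?, ?, ?, ?)",
    [
        (1, "ann", "enter", "2010-01-21 09:00"),
        (1, "ann", "leave", "2010-01-21 10:00"),
        (1, "ann", "enter", "2010-01-21 11:00"),  # re-entered, still inside
        (1, "bob", "enter", "2010-01-21 09:30"),
        (1, "bob", "leave", "2010-01-21 10:30"),  # left and did not return
        (1, "cat", "enter", "2010-01-21 09:45"),  # never left
    ],
)

def students_inside(conn, classroom, as_of):
    """Students whose most recent action at or before `as_of` is 'enter'."""
    rows = conn.execute(
        """
        SELECT c.student
        FROM classroom_log c
        WHERE c.classroom = :room
          AND c.action = 'enter'
          AND c.ts = (
              SELECT MAX(x.ts) FROM classroom_log x
              WHERE x.classroom = c.classroom
                AND x.student = c.student
                AND x.ts <= :as_of
          )
        ORDER BY c.student
        """,
        {"room": classroom, "as_of": as_of},
    ).fetchall()
    return [r[0] for r in rows]

now_inside = students_inside(conn, 1, "2010-01-21 12:00")
at_1015 = students_inside(conn, 1, "2010-01-21 10:15")
```

Each student appears at most once, so the re-entry problem from the question does not occur, and passing an earlier as_of value answers "who was inside at that time". The query could be run through Django's raw SQL support or translated to the ORM.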