Using Django ORM to retrieve recent rows - django

In SQL, if I wanted to query a table for data from the most recent 10 minutes (regardless of timezones and such), I'd simply do (using postgresql parlance):
select * from table where creation_time > now() - interval'10 mins';
Is there an equivalent way to do something like this using the Django ORM, disregarding what timezone settings one has set for the app? Would be great to get an illustrative example here.

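Try this:-
Data from the last 10 minutes :-
from datetime import timedelta
from django.utils import timezone
# timezone.now() is timezone-aware when USE_TZ = True, so the comparison does not depend on the app's TIME_ZONE
time_threshold = timezone.now() - timedelta(minutes=10)
results = Table.objects.filter(createdOn__gte=time_threshold)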
Last 10 rows based on createdOn value:-
recentData = Table.objects.all().order_by('-createdOn')[:10]
Last 10 rows if you don't have createdOn column to filter:-
recentData = Table.objects.all().order_by('-id')[:10]
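The filter is independent of timezone settings when both sides of the comparison are timezone-aware instants rather than wall-clock values. Here is a minimal stdlib sketch of the same cutoff logic outside the ORM. The helper name is_recent and the sample datetimes are made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def is_recent(created_at, now, minutes=10):
    """Return True if created_at falls within the last `minutes` before now."""
    cutoff = now - timedelta(minutes=minutes)
    # Aware datetimes compare as instants, whatever tzinfo each one carries.
    return created_at >= cutoff


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
utc_plus_9 = timezone(timedelta(hours=9))

# 20:55 at UTC+9 is 11:55 UTC, i.e. five minutes before `now`.
fresh = is_recent(datetime(2024, 1, 1, 20, 55, tzinfo=utc_plus_9), now)
# 20:45 at UTC+9 is 11:45 UTC, i.e. fifteen minutes before `now`.
stale = is_recent(datetime(2024, 1, 1, 20, 45, tzinfo=utc_plus_9), now)
```

The ORM filter with createdOn__gte applies the same comparison inside the database.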

Related

Sum greatest values of each day from period with Django Query

My proj has a model that goes like:
class Data(Model):
    data = FloatField(verbose_name='Data', null=True, blank=True)
    created_at = DateTimeField(verbose_name='Created at')
And my app creates a few hundred logs of this model per day.
I'm trying to sum only the greatest values of each day, without having to iterate over them (using only Django queries).
Is it possible without writing SQL queries?
PS: I'm able to get the greatest 'data' of each day, so the current logic iterates over days and sums the greatest values of each day. But that solution is becoming too slow and I'd like to solve it directly into db level.
Annotations and aggregates to the rescue:
from django.db.models import Sum, Max
from django.db.models.functions import Trunc
report = (Data.objects
    .annotate(day=Trunc('created_at', 'day'))
    .values('day')
    .annotate(greatest=Max('data'))
    .values('greatest')
    .aggregate(total=Sum('greatest'))
)
print(report['total'])
The resulting SQL is almost simpler than the code:
SELECT SUM("greatest")
FROM
(SELECT MAX("app_data"."data") AS "greatest"
FROM "app_data"
GROUP BY DATE_TRUNC('day', "app_data"."created_at")) subquery
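To see what a query of this shape computes, here is a small sketch that runs it against an in-memory SQLite database using only the standard library. The rows are invented for illustration, and SQLite's date() stands in for PostgreSQL's DATE_TRUNC('day', ...).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE app_data (data REAL, created_at TEXT)")
conn.executemany(
    "INSERT INTO app_data (data, created_at) VALUES (?, ?)",
    [
        (1.0, "2024-01-01 08:00:00"),
        (5.0, "2024-01-01 17:30:00"),  # greatest value on 2024-01-01
        (2.5, "2024-01-02 09:15:00"),
        (4.0, "2024-01-02 23:59:00"),  # greatest value on 2024-01-02
    ],
)

# Inner query: one MAX per calendar day. Outer query: sum of those maxima.
total = conn.execute(
    """
    SELECT SUM(greatest)
    FROM (SELECT MAX(data) AS greatest
          FROM app_data
          GROUP BY date(created_at)) AS subquery
    """
).fetchone()[0]
conn.close()
```

With the sample rows above, total is 5.0 + 4.0.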
If you are using a database backend that supports DISTINCT ON fields (as PostgreSQL does), you can fetch the row with the greatest value for each day:
Data.objects.order_by('created_at__date', '-data').distinct('created_at__date')

Comparing unix timestamp with date in Django ORM

I am trying to fetch all records from a table on a particular date.
My url.py code is :
url(r'^jobs/date/(?P<date>.*)', RunningJobsListApiView.as_view()),
Here is the code of my view to get all the records from the table.
class RunningJobsListApiView(generics.ListAPIView):
    queryset = LinuxJobTable.objects.annotate(status=Case(
        When(state=3, then=Value('Completed')),
        When(state=5, then=Value('Failed')),
        When(state=1, then=Value('Running')),
        default=Value('Unknown'), output_field=CharField()))
    serializer_class = JobInfoSerializer
Now I want to filter the jobs for the particular date in the URL, but in my database the date is stored as a UNIX timestamp (e.g. 1530773247).
How can I compare a date in mm-dd-yyyy format with the UNIX timestamps saved in the DB?
To get a UNIX timestamp from a string date representation, first convert the string to a Python datetime with strptime(), then call its timestamp() method (for a naive datetime, timestamp() interprets the value in the server's local timezone). Since a single day spans a range of timestamps, you need a range query between the start of the target day and the start of the next day.
Something like:
from datetime import datetime, timedelta
target_day = datetime.strptime(date, "%m-%d-%Y")
next_day = target_day + timedelta(days=1)
queryset = LinuxJobTable.objects.filter(timestamp__range=(
    int(target_day.timestamp()),
    int(next_day.timestamp()) - 1,  # since range is inclusive
))
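The bounds calculation can be checked on its own, without a database. This sketch assumes the stored timestamps should be interpreted as UTC days, which the question does not state. The helper name day_bounds is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def day_bounds(date_string):
    """Return (first, last) UNIX timestamps of the given mm-dd-yyyy day in UTC."""
    # Assumption: days are delimited in UTC, so the naive result of strptime()
    # is pinned to UTC instead of the server's local timezone.
    target_day = datetime.strptime(date_string, "%m-%d-%Y").replace(tzinfo=timezone.utc)
    next_day = target_day + timedelta(days=1)
    # Subtract one second because a __range lookup is inclusive on both ends.
    return int(target_day.timestamp()), int(next_day.timestamp()) - 1


start, end = day_bounds("07-05-2018")
# The sample value from the question falls on that day.
contains_sample = start <= 1530773247 <= end
```

The resulting pair can be passed directly to timestamp__range.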

Filter two dates in one query django/python

Is there a way I can filter two datefield columns from a table on the same query?
Example:
I have date_ots and date_lta I need to filter and get the result from today + 4 days on both columns.
Thanks for your attention,
Alex
If you would like exact timing (hours, min, sec), you can do:
from datetime import timedelta
from django.utils import timezone
four_days_from_now = timezone.now() + timedelta(days=4)
query = Model.objects.filter(date_ots=four_days_from_now, date_lta=four_days_from_now)
If you only want the date 4 days from now (at any time), you can do:
from datetime import timedelta
from django.utils import timezone
four_days_from_now = timezone.now().date() + timedelta(days=4)
query = Model.objects.filter(date_ots=four_days_from_now, date_lta=four_days_from_now)

Django queryset aggregate by time interval

Hi, I am writing a Django view which outputs data for graphing on the client side (Highcharts). The data is climate data with a given parameter recorded once per day.
My query is this:
format = '%Y-%m-%d'
sd = datetime.datetime.strptime(startdate, format)
ed = datetime.datetime.strptime(enddate, format)
data = Climate.objects.filter(recorded_on__range = (sd, ed)).order_by('recorded_on')
Now, as the range is increased the dataset obviously gets larger and this does not present well on the graph (aside from slowing things down considerably).
Is there a way to group my data as averages over time periods, specifically an average for each month or for each year?
I realize this could be done in SQL as mentioned here: django aggregation to lower resolution using grouping by a date range
But I would like to know if there is a handy way in Django itself.
Or is it perhaps better to modify the db directly and use a script to populate month and year fields from the timestamp?
Any help much appreciated.
Have you tried using django-qsstats-magic (https://github.com/kmike/django-qsstats-magic)?
It makes things very easy for charting, here is a timeseries example from their docs:
from django.contrib.auth.models import User
import datetime, qsstats
qs = User.objects.all()
qss = qsstats.QuerySetStats(qs, 'date_joined')
today = datetime.date.today()
seven_days_ago = today - datetime.timedelta(days=7)
time_series = qss.time_series(seven_days_ago, today)
print('New users in the last 7 days: %s' % [t[1] for t in time_series])
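For the monthly averages the question asks about, the grouping itself is a plain GROUP BY on the truncated date. Below is a stdlib-only sketch against an in-memory SQLite database. The table name climate, the column name value, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE climate (recorded_on TEXT, value REAL)")
conn.executemany(
    "INSERT INTO climate (recorded_on, value) VALUES (?, ?)",
    [
        ("2024-01-10", 10.0),
        ("2024-01-20", 20.0),
        ("2024-02-05", 30.0),
    ],
)

# Truncate each date to year-month and average the readings within each bucket.
monthly_averages = conn.execute(
    """
    SELECT strftime('%Y-%m', recorded_on) AS month, AVG(value)
    FROM climate
    GROUP BY month
    ORDER BY month
    """
).fetchall()
conn.close()
```

Django 1.10 and later can express a comparable grouping in the ORM with TruncMonth from django.db.models.functions together with Avg, which avoids adding separate month and year columns.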

Aggregate difference between DateTime fields in Django

I have a table containing a series of entries which relate to time periods (specifically, time worked for a client):
task_time:
id | start_time       | end_time         | client (fk)
1  | 08/12/2011 14:48 | 08/12/2011 14:50 | 2
I am trying to aggregate all the time worked for a given client, from my Django app:
time_worked_aggregate = models.TaskTime.objects.\
    filter(client = some_client_id).\
    extra(select = {'elapsed': 'SUM(task_time.end_time - task_time.start_time)'}).\
    values('elapsed')
if len(time_worked_aggregate) > 0:
    time_worked = time_worked_aggregate[0]['elapsed'].total_seconds()
else:
    time_worked = 0
This seems inelegant, but it does work. Or at least so I thought: it turns out that it works fine on a PostgreSQL database, but when I move over to SQLite, everything dies.
A bit of digging suggests that the reason for this is that DateTimes aren't first-class data in SQLite. The following raw SQLite query will do my job:
SELECT SUM(strftime('%s', end_time) - strftime('%s', start_time)) FROM task_time WHERE ...;
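As a quick check of that SQLite query, the following sketch runs it against an in-memory database using Python's sqlite3 module. The sample rows are invented, and the datetimes are stored as ISO-8601 text, which is the form SQLite's strftime() parses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task_time (start_time TEXT, end_time TEXT, client INTEGER)")
conn.executemany(
    "INSERT INTO task_time (start_time, end_time, client) VALUES (?, ?, ?)",
    [
        ("2011-12-08 14:48:00", "2011-12-08 14:50:00", 2),  # 2 minutes
        ("2011-12-09 09:00:00", "2011-12-09 10:00:00", 2),  # 1 hour
        ("2011-12-09 09:00:00", "2011-12-09 09:30:00", 3),  # other client
    ],
)

# strftime('%s', ...) yields seconds since the epoch, so the difference
# between the two values is the elapsed time in seconds.
total_seconds = conn.execute(
    """
    SELECT SUM(strftime('%s', end_time) - strftime('%s', start_time))
    FROM task_time
    WHERE client = ?
    """,
    (2,),
).fetchone()[0]
conn.close()
```

For client 2 the sample rows add up to 120 + 3600 seconds.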
My question is as follows:
The Python sample above seems roundabout. Can we do this more elegantly?
More importantly at this stage, can we do it in a way that will work on both Postgres and SQLite? Ideally, I'd like not to be writing raw SQL queries and switching on the database backend that happens to be in place; in general, Django is extremely good at protecting us from this. Does Django have a reasonable abstraction for this operation? If not, what's a sensible way for me to do a conditional switch on the backend?
I should mention for context that the dataset is many thousands of entries; the following is not really practical:
sum((task_time.end_time - task_time.start_time for task_time in models.TaskTime.objects.filter(...)), timedelta())
Almost the same solution as @andri proposed; the final result contains the same data.
It uses ExpressionWrapper, which is new in Django 1.8.
from datetime import timedelta
from django.db.models import ExpressionWrapper, F, fields
from app.models import MyModel
duration = ExpressionWrapper(F('closed_at') - F('opened_at'), output_field=fields.DurationField())
objects = MyModel.objects.closed().annotate(duration=duration).filter(duration__gt=timedelta(seconds=2))
for obj in objects:
    print(obj.id, obj.duration, obj.duration.seconds)
# sample output
# 807 0:00:57.114017 57
# 800 0:01:23.879478 83
# 804 3:40:06.797188 13206
# 801 0:02:06.786300 126
I think since Django 1.8 we can do better.
I will only show the annotation part; the aggregation that follows should be straightforward:
from django.db.models import DurationField, F, Func
SomeModel.objects.annotate(
    # age() is PostgreSQL-specific; output_field tells Django the result is an interval
    duration=Func(F('end_date'), F('start_date'), function='age', output_field=DurationField())
)
[more about the PostgreSQL age function here: http://www.postgresql.org/docs/8.4/static/functions-datetime.html ]
Each instance of SomeModel will be annotated with a duration field containing the time difference, which in Python will be a datetime.timedelta object [more about datetime.timedelta here: https://docs.python.org/2/library/datetime.html#timedelta-objects ]
I would do it in two steps:
first, annotate each row with the timedelta;
then group by client and sum the timedeltas.
The code looks like this:
from django.db.models import Count, Sum, F
times_obj_list = models.TaskTime.objects.annotate(times=F("end_time") - F("start_time"))
groupby_obj_list = times_obj_list.values("client").annotate(cnt=Count("id"), seconds=Sum("times")).order_by()
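The two steps above (compute a timedelta per row, then group by client and sum) can be sketched in plain Python to show the result the database is expected to return. The sample rows are invented. For thousands of entries this work belongs in the database, as the question notes.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# (client, start_time, end_time) tuples standing in for TaskTime rows.
rows = [
    (2, datetime(2011, 12, 8, 14, 48), datetime(2011, 12, 8, 14, 50)),
    (2, datetime(2011, 12, 9, 9, 0), datetime(2011, 12, 9, 10, 0)),
    (3, datetime(2011, 12, 9, 9, 0), datetime(2011, 12, 9, 9, 30)),
]

# Step 1: per-row duration. Step 2: accumulate the durations per client.
totals = defaultdict(timedelta)
for client, start_time, end_time in rows:
    totals[client] += end_time - start_time

seconds_by_client = {client: total.total_seconds() for client, total in totals.items()}
```

Here seconds_by_client maps each client id to the total seconds worked.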
Django currently only supports aggregates for Min, Max, Avg and Count, so using raw SQL is the only way to achieve what you want. When you use raw SQL, database-independence is out the window, so unfortunately, you're out of luck. You'll have to just detect the database and alter the SQL appropriately.