I have a model containing "caller_name" and "call_datetime" fields.
I was able to get the number of calls that occurred on each day in a particular month:
while start_date <= end_date:
    calls = CDR.objects.filter(start_time__year=str(start_date.year), start_time__month=str(start_date.month),
                               start_time__day=str(start_date.day))
    print "Number of calls:", len(calls)
    start_date = start_date + datetime.timedelta(days=1)
Similarly, I tried to get the number of calls in each hour on a particular date.
for i in range(24):
    calls = CDR.objects.filter(start_time__year=str(start_date.year), start_time__month=str(start_date.month), start_time__day=str(start_date.day), start_time__hour=str(i))
I found out that "start_time__hour" is not implemented, but is there any way to achieve this?
Try this workaround:
day_calls = CDR.objects.filter(start_time__year=str(start_date.year), start_time__month=str(start_date.month), start_time__day=str(start_date.day))
hour_calls = day_calls.extra(select={'hours': 'DATE_FORMAT(start_time, "%%H")'})\
    .values_list('hours', flat=True)\
    .distinct()\
    .order_by('hours')
You could either use raw SQL with the .extra() method or something like this:
for i in range(24):
    dt1 = start_time.replace(hour=i)
    dt2 = dt1 + datetime.timedelta(hours=1)
    calls = CDR.objects.filter(start_time__gte=dt1, start_time__lt=dt2)
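The half-open hourly windows used in the loop above can be built and checked without touching the database. This is only a minimal sketch, assuming the day is given as a `datetime.date`; `hourly_windows` is a hypothetical helper name, not part of Django:

```python
import datetime


def hourly_windows(day):
    """Return 24 (start, end) datetime pairs covering `day` hour by hour."""
    midnight = datetime.datetime.combine(day, datetime.time.min)
    windows = []
    for hour in range(24):
        start = midnight + datetime.timedelta(hours=hour)
        # Half-open interval [start, end): pair it with __gte / __lt lookups.
        windows.append((start, start + datetime.timedelta(hours=1)))
    return windows


windows = hourly_windows(datetime.date(2012, 1, 1))
```

Each pair could then be passed as `start_time__gte=start, start_time__lt=end`. The datetimes here are naive; if the project uses time zone support, they would presumably need to be made aware first.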
I have a table for a ManyToMany relationship, where each tutor needs to input the multiple days on which he wants to tutor students. Like:
Availability Tutor:
user  available_day  time
t1    Sun            6,7,8
t2    mon            3,4
t1    mon            1,2
I would like to get all tutors whose availability matches the searched day.
Model:
class TutorProfile(models.Model):
    user = models.ForeignKey(
        settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
    tutor_availablility = models.ManyToManyField(
        Day)
queryset:
def get_queryset(self):
    subject_param = self.request.GET.get('subject')
    bookday = self.request.GET.get('book_day')
    grade = self.request.GET.get('grade')
    li = list(bookday.split(","))
    print("Book Info:", li)
    for i in range(len(li)):
        print(li[i])
        _day_name = datetime.strptime(li[i], "%d-%m-%Y").strftime("%A")
        print("Day Name:", _day_name)
        day_num = Day.objects.get(day_name=_day_name)
        print("Day_Num",day_num)
        result = TutorProfile.objects.filter(
            tutor_availablility=day_num).all()
        return result
When I run this query, it only loops once, for one day, and not for multiple days.
How can I run this query so that it does not stop after one loop?
After one loop it says:
An exception occurred
An exception occurred
I tried to apply a try/except block here, but I have no clue. Why does the loop stop after one cycle?
After one loop, you are returning the results. Instead, use the Q object to store the conditions and then filter the queryset outside of the for loop:
from django.db.models import Q
...
query = Q()
for i in range(len(li)):
    _day_name = datetime.strptime(li[i], "%d-%m-%Y").strftime("%A")
    day_num = Day.objects.get(day_name=_day_name)
    # OR each day's condition into the accumulated query
    query |= Q(tutor_availablility=day_num)
# filter once, after the loop; distinct() drops tutors matched by several days
return TutorProfile.objects.filter(query).distinct()
Update
Although the following code should not change the outcome, try it like this:
li = list(bookday.split(","))
query = list(map(lambda x: datetime.strptime(x, "%d-%m-%Y").strftime("%A"), li))
return TutorProfile.objects.filter(tutor_availablility__day_name__in=query)
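The date-to-weekday conversion that feeds the `__in` lookup can be tried on its own. A minimal sketch, assuming the `book_day` parameter is a comma-separated string of `dd-mm-YYYY` dates as in the question; `weekday_names` is a hypothetical helper name:

```python
from datetime import datetime


def weekday_names(book_day):
    """Convert 'dd-mm-YYYY,dd-mm-YYYY,...' into a list of unique weekday names."""
    names = []
    for raw in book_day.split(","):
        # %A yields the full weekday name (English under the default C locale).
        name = datetime.strptime(raw.strip(), "%d-%m-%Y").strftime("%A")
        if name not in names:
            names.append(name)
    return names


names = weekday_names("05-03-2019, 06-03-2019, 12-03-2019")
```

The resulting list is what would be handed to the `day_name__in` filter, assuming `Day.day_name` stores full English weekday names.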
I have a model that looks something like this:
class Payment(TimeStampModel):
    timestamp = models.DateTimeField(auto_now_add=True)
    amount = models.FloatField()
    creator = models.ForeignKey(to='Payer')
What is the correct way to calculate average spending per day?
I can aggregate by day, but then the days when a payer does not spend anything won't count, which is not correct.
UPDATE:
So, let's say I have only two records in my db, one from March 1 and one from January 1. The average spending per day should be something like
(Sum of all spendings) / (March 1 - January 1)
that is, the sum divided by 60.
However, this of course gives me just the average spending per item, and the number of days would come out as 2:
for p in Payment.objects.all():
    print(p.timestamp, p.amount)
p = Payment.objects.all().dates('timestamp','day').aggregate(Sum('amount'), Avg('amount'))
print(p)
Output:
2019-03-05 17:33:06.490560+00:00 456.0
2019-01-05 17:33:06.476395+00:00 123.0
{'amount__sum': 579.0, 'amount__avg': 289.5}
You can aggregate min and max timestamp and the sum of amount:
from django.db.models import Min, Max, Sum
def average_spending_per_day():
    aggregate = Payment.objects.aggregate(Min('timestamp'), Max('timestamp'), Sum('amount'))
    min_datetime = aggregate.get('timestamp__min')
    if min_datetime is not None:
        min_date = min_datetime.date()
        max_date = aggregate.get('timestamp__max').date()
        total_amount = aggregate.get('amount__sum')
        days = (max_date - min_date).days + 1
        return total_amount / days
    return 0
If there is a min_datetime then there is some data in the db table, and there is also max date and total amount, otherwise we return 0 or whatever you want.
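The arithmetic can be checked against the two sample payments from the question without a database. This is a minimal sketch in plain Python; `average_per_day` is a hypothetical stand-in that takes `(timestamp, amount)` tuples instead of querying `Payment`:

```python
from datetime import datetime


def average_per_day(payments):
    """payments: list of (timestamp, amount) tuples; returns the daily average."""
    if not payments:
        return 0
    dates = [ts.date() for ts, _ in payments]
    # Inclusive day count: the first and the last day both count.
    days = (max(dates) - min(dates)).days + 1
    return sum(amount for _, amount in payments) / days


sample = [
    (datetime(2019, 3, 5, 17, 33, 6), 456.0),
    (datetime(2019, 1, 5, 17, 33, 6), 123.0),
]
avg = average_per_day(sample)
```

From 2019-01-05 to 2019-03-05 inclusive is 60 days, so the result is 579.0 / 60 = 9.65 rather than the per-item average of 289.5.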
It depends on your backend, but you want to divide the sum of amount by the difference in days between your max and min timestamp. In Postgres, you can simply subtract two dates to get the number of days between them. With MySQL there is a function called DateDiff that takes two dates and returns the number of days between them.
from django.db.models import (
    DecimalField, ExpressionWrapper, Func, IntegerField, Max, Min, Sum, Value)
class Date(Func):
    function = 'DATE'
class MySQLDateDiff(Func):
    function = 'DATEDIFF'
    def __init__(self, *expressions, **extra):
        expressions = [Date(exp) for exp in expressions]
        extra['output_field'] = extra.get('output_field', IntegerField())
        super().__init__(*expressions, **extra)
class PgDateDiff(Func):
    template = "%(expressions)s"
    arg_joiner = ' - '
    def __init__(self, *expressions, **extra):
        expressions = [Date(exp) for exp in expressions]
        extra['output_field'] = extra.get('output_field', IntegerField())
        super().__init__(*expressions, **extra)
agg = {
    'avg_spend': ExpressionWrapper(
        Sum('amount') / (PgDateDiff(Max('timestamp'), Min('timestamp')) + Value(1)),
        output_field=DecimalField())
}
avg_spend = Payment.objects.aggregate(**agg)
That looks roughly right to me, but I haven't tested it. Use MySQLDateDiff instead if that's your backend.
Here is my PostgreSQL statement.
select round(sum("amount") filter(where "date">=now()-interval '12 months')/12,0) as avg_12month from "amountTab"
How can I use this in Django?
I have an object called 'Devc', with an attribute 'date'.
I want to get the sum of the specific data within the past 12 months, not the past 365 days.
You can try this to get the data within the past 12 months.
from datetime import datetime, timedelta
from django.db.models import Sum
today = datetime.now()
current_month_first_day = today.replace(day=1)
previous_month_last_day = current_month_first_day - timedelta(days=1)
past_12_month_first_day = previous_month_last_day - timedelta(days=360)
past_12_month_first_day = past_12_month_first_day.replace(day=1)
# aggregate() keys its result as '<field>__sum'; divide the sum by 12 for the monthly average
past_12_month_sum = Devc.objects.filter(date__range=(past_12_month_first_day, current_month_first_day)).aggregate(Sum('amount'))['amount__sum']
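The month boundaries can also be computed directly, without the 360-day step. This is only a sketch under the assumption that "past 12 months" means the 12 full calendar months before the current month, as in the answer above; `last_12_full_months` is a hypothetical helper name:

```python
from datetime import date


def last_12_full_months(today):
    """Return (start, end) dates for the 12 calendar months before today's month.

    The interval is half-open: start <= d < end.
    """
    end = today.replace(day=1)               # first day of the current month
    start = end.replace(year=end.year - 1)   # same month, one year earlier
    return start, end


start, end = last_12_full_months(date(2024, 3, 15))
```

Because the day is always 1, replacing the year can never produce an invalid date. The pair would presumably be used as `date__gte=start, date__lt=end`, which avoids counting the first day of the current month the way an inclusive `__range` does.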
I am working with Django to see how to handle large databases. I use a database with the fields name, age, date of birth (dob) and height. The database has about 500,000 entries. I have to find the average height of persons (1) of the same age and (2) born in the same year. The aggregate function in querying the table takes about 10 s. Is this usual, or am I missing something?
For age:
age = [i[0] for i in Data.objects.values_list('age').distinct()]
ht = []
for each in age:
    aggr = Data.objects.filter(age=each).aggregate(ag_ht=Avg('height'))
    ht.append(aggr)
For dob:
age = [i[0].year for i in Data.objects.values_list('dob').distinct()]
for each in age:
    aggr = Data.objects.filter(dob__contains=each).aggregate(ag_ht=Avg('height'))
    ht.append(aggr)
The year has to be extracted from dob. It is SQLite and I cannot use __year (join).
For these queries to be efficient, you have to create indexes on the age and dob columns.
You will get a small additional speedup by using covering indexes, i.e., using two-column indexes that also include the height column.
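The two-column indexes can be tried directly against SQLite with the standard `sqlite3` module. This is a minimal sketch, not the Django setup from the question: the table name `data`, the index names, and the three sample rows are made up for illustration, and `dob` is assumed to be stored as ISO text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (name TEXT, age INTEGER, dob TEXT, height REAL)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?, ?)",
    [
        ("a", 30, "1990-02-01", 170.0),
        ("b", 30, "1990-07-15", 180.0),
        ("c", 25, "1995-03-09", 160.0),
    ],
)
# Two-column indexes: the grouping column first, then height, so the
# aggregate can be answered from the index without reading the table rows.
conn.execute("CREATE INDEX data_age_height ON data (age, height)")
conn.execute("CREATE INDEX data_dob_height ON data (dob, height)")

# One grouped query per question instead of one query per distinct value.
by_age = conn.execute(
    "SELECT age, AVG(height) FROM data GROUP BY age ORDER BY age"
).fetchall()
by_year = conn.execute(
    "SELECT strftime('%Y', dob) AS yr, AVG(height) FROM data GROUP BY yr ORDER BY yr"
).fetchall()
```

The index on `dob` does not let SQLite seek by year, since the year is computed by `strftime`; at best it gives the query a narrower structure to scan. In a Django project the equivalent indexes would presumably be declared in the model's `Meta.indexes`.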
Here is the full version, timing the loop against the queryset version:
import time
from dd.models import Data
from django.db.models import Avg
from django.db.models.functions import ExtractYear
For age:
start = time.time()
age = [i[0] for i in Data.objects.values_list('age').distinct()]
ht = []
for each in age:
    aggr = Data.objects.filter(age=each).aggregate(ag_ht=Avg('height'))
    ht.append(aggr)
end = time.time()
loop_time = end - start
start = time.time()
qs = Data.objects.values('age').annotate(ag_ht=Avg('height')).order_by('age')
# list() forces evaluation; querysets are lazy, so without it no query is timed
ht_qs = list(qs.values_list('age', 'ag_ht'))
end = time.time()
qs_time = end - start
print(loop_time / qs_time)
For dob year, with a small refactoring of your version (using a set for the years):
start = time.time()
years = set([i[0].year for i in Data.objects.values_list('dob').distinct()])
ht_year_loop = []
for each in years:
    aggr = Data.objects.filter(dob__contains=each).aggregate(ag_ht=Avg('height'))
    ht_year_loop.append((each, aggr.get('ag_ht')))
end = time.time()
loop_time = end - start
start = time.time()
qs = Data.objects.annotate(dob_year=ExtractYear('dob')).values('dob_year').annotate(ag_ht=Avg('height'))
# list() forces evaluation; querysets are lazy, so without it no query is timed
ht_qs = list(qs.values_list('dob_year', 'ag_ht'))
end = time.time()
qs_time = end - start
print(loop_time / qs_time)
Let's assume that I have a model:
class MyModel(...):
    start = models.DateTimeField()
    stop = models.DateTimeField(null=True, blank=True)
And I have also two records:
start=2012-01-01 7:00:00 stop=2012-01-01 14:00:00
start=2012-01-01 7:00:03 stop=2012-01-01 23:59:59
Now I want to find the second record, so its start datetime should be between start and stop, and stop should have the time 23:59:59. How do I build such a query?
Some more info:
I think this requires an F object. I want to find all records where start -> time is between another start -> time and stop -> time, and stop -> time is 23:59:59, and the date is the same as in start.
You can use range and extra:
from django.db.models import Q
q1 = Q( start__range=(start_date_1, end_date_1) )
q2 = Q( start__range=(start_date_2, end_date_2) )
query = (''' EXTRACT(hour from stop) = %i
    and EXTRACT(minute from stop) = %i
    and EXTRACT(second from stop) = %i''' %
    (23, 59,59)
)
MyModel.objects.filter( q1 | q2).extra(where=[query])
Note: This was posted before the question was changed to add the hard requirement 'time is 23:59:59, and date is the same like in start'.
To perform the query: "start datetime should be between start and stop"
MyModel.objects.filter(start__gte=obj1.start, start__lte=obj1.stop)
I don't quite understand your second condition, though. Do you want it to match only objects with hour 23:59:59, but for any day?
dt = '2012-01-01 8:00:00'
stop_hour = '23'
stop_minute = '59'
stop_sec = '59'
where = 'HOUR(stop) = %(hour)s AND MINUTE(stop) = %(minute)s AND SECOND(stop) = %(second)s' \
    % {'hour': stop_hour, 'minute': stop_minute, 'second': stop_sec}
objects = MyModel.objects.filter(start__gte=dt, stop__lte=dt) \
    .extra(where=[where])
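The combined condition from the question can be spelled out as a plain Python predicate and checked against the two sample records. This is only a sketch of the intended logic, not a database query; `matches` is a hypothetical helper and records are represented as `(start, stop)` tuples:

```python
from datetime import datetime, time


def matches(candidate, reference):
    """candidate / reference: (start, stop) datetime pairs."""
    c_start, c_stop = candidate
    r_start, r_stop = reference
    return (
        r_start <= c_start <= r_stop            # start falls inside the reference span
        and c_stop is not None
        and c_stop.time() == time(23, 59, 59)   # stop is exactly 23:59:59
        and c_stop.date() == c_start.date()     # on the same date as start
    )


first = (datetime(2012, 1, 1, 7, 0, 0), datetime(2012, 1, 1, 14, 0, 0))
second = (datetime(2012, 1, 1, 7, 0, 3), datetime(2012, 1, 1, 23, 59, 59))
```

Here `matches(second, first)` holds, while `matches(first, second)` does not, which is the behaviour a database-side version of the query would need to reproduce.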