Django ORM queryset equivalent to group by year-month?

I have a Django app that needs some data visualization, and I am stuck on the ORM.
I have a model Orders with a field created_at, and I want to present the data as a bar chart (number of orders per year-month) in a dashboard template.
So I need to aggregate/annotate data from my model, but I did not find a complete solution.
I found a partial answer using TruncMonth and read about serializers, but I wonder whether there is a simpler solution using the Django ORM alone.
In PostgreSQL it would be:
SELECT date_trunc('month',created_at), count(order_id) FROM "Orders" GROUP BY date_trunc('month',created_at) ORDER BY date_trunc('month',created_at);
"2021-01-01 00:00:00+01" "2"
"2021-02-01 00:00:00+01" "3"
"2021-03-01 00:00:00+01" "3"
...
Example data:
1 "2021-01-04 07:42:03+01"
2 "2021-01-24 13:59:44+01"
3 "2021-02-06 03:29:11+01"
4 "2021-02-06 08:21:15+01"
5 "2021-02-13 10:38:36+01"
6 "2021-03-01 12:52:22+01"
7 "2021-03-06 08:04:28+01"
8 "2021-03-11 16:58:56+01"
9 "2022-03-25 21:40:10+01"
10 "2022-04-04 02:12:29+02"
11 "2022-04-13 08:24:23+02"
12 "2022-05-08 06:48:25+02"
13 "2022-05-19 15:40:12+02"
14 "2022-06-01 11:29:36+02"
15 "2022-06-05 02:15:05+02"
16 "2022-06-05 03:08:22+02"
Expected result:
[
{
"year-month": "2021-01",
"number" : 2
},
{
"year-month": "2021-02",
"number" : 3
},
{
"year-month": "2021-03",
"number" : 3
},
{
"year-month": "2022-03",
"number" : 1
},
{
"year-month": "2022-04",
"number" : 2
},
{
"year-month": "2022-05",
"number" : 2
},
{
"year-month": "2022-06",
"number" : 3
}
]
I have done this but I am not able to order by date:
Orders.objects.annotate(month=TruncMonth('created_at')).values('month').annotate(number=Count('order_id')).values('month', 'number').order_by()
<SafeDeleteQueryset [
{'month': datetime.datetime(2022, 3, 1, 0, 0, tzinfo=<UTC>), 'number': 4},
{'month': datetime.datetime(2022, 6, 1, 0, 0, tzinfo=<UTC>), 'number': 2},
{'month': datetime.datetime(2022, 5, 1, 0, 0, tzinfo=<UTC>), 'number': 1},
{'month': datetime.datetime(2022, 1, 1, 0, 0, tzinfo=<UTC>), 'number': 5},
{'month': datetime.datetime(2021, 12, 1, 0, 0, tzinfo=<UTC>), 'number': 1},
{'month': datetime.datetime(2022, 7, 1, 0, 0, tzinfo=<UTC>), 'number': 1},
{'month': datetime.datetime(2021, 9, 1, 0, 0, tzinfo=<UTC>), 'number': 2},
'...(remaining elements truncated)...'
]>

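Your order_by() call has no arguments, which clears any ordering. Order by the month annotation instead. Ordering by the original created_at field would add that field to the GROUP BY and break the monthly grouping. Also count the orders with Count rather than summing them.
from django.db.models import Count
from django.db.models.functions import TruncMonth
Orders.objects.annotate(month=TruncMonth('created_at')).values('month').annotate(number=Count('order_id')).order_by('month')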
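To get from the grouped rows to the list shape shown in the question, the month values can be formatted in plain Python after the query runs. The following is a minimal sketch, assuming the rows are dicts with a 'month' datetime and a 'number' count as in the queryset output above. The helper name to_year_month_rows is hypothetical and not part of Django.

```python
from datetime import datetime, timezone


def to_year_month_rows(rows):
    """Format grouped rows as {"year-month": "YYYY-MM", "number": n}, oldest first."""
    # Sorting here is a safety net in case the queryset was not ordered.
    ordered = sorted(rows, key=lambda row: row["month"])
    return [
        {"year-month": row["month"].strftime("%Y-%m"), "number": row["number"]}
        for row in ordered
    ]


# Sample rows shaped like the queryset output in the question.
sample = [
    {"month": datetime(2022, 3, 1, tzinfo=timezone.utc), "number": 4},
    {"month": datetime(2021, 12, 1, tzinfo=timezone.utc), "number": 1},
    {"month": datetime(2022, 1, 1, tzinfo=timezone.utc), "number": 5},
]
result = to_year_month_rows(sample)
```

The resulting list can be passed to the template or serialized to JSON for the chart.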

Related

How add 0 when TruncWeek's week no result in Django Query?

I want to count issues grouped by week.
query1 = MyModel.objects.filter(issue_creator__in=group.user_set.all()).\
annotate(week=TruncWeek('issue_creat_date')).values('week').annotate(count=Count('id')).order_by('week')
The query works. The queryset result:
[
{'week': datetime.datetime(2022, 1, 3, 0, 0, tzinfo=<DstTzInfo 'Europe/Stockholm' CEST+2:00:00 DST>), 'count': 9},
{'week': datetime.datetime(2022, 1, 10, 0, 0, tzinfo=<DstTzInfo 'Europe/Stockholm' CEST+2:00:00 DST>), 'count': 12},
{'week': datetime.datetime(2022, 1, 17, 0, 0, tzinfo=<DstTzInfo 'Europe/Stockholm' CEST+2:00:00 DST>), 'count': 10},
{'week': datetime.datetime(2022, 2, 7, 0, 0, tzinfo=<DstTzInfo 'Europe/Stockholm' CEST+2:00:00 DST>), 'count': 1},
{'week': datetime.datetime(2022, 2, 14, 0, 0, tzinfo=<DstTzInfo 'Europe/Stockholm' CEST+2:00:00 DST>), 'count': 6},
{'week': datetime.datetime(2022, 2, 21, 0, 0, tzinfo=<DstTzInfo 'Europe/Stockholm' CEST+2:00:00 DST>), 'count': 11},
{'week': datetime.datetime(2022, 2, 28, 0, 0, tzinfo=<DstTzInfo 'Europe/Stockholm' CEST+2:00:00 DST>), 'count': 1}
]
But the period from 2022-01-01 to 2022-03-01 has 9 weeks:
[
datetime.datetime(2022, 1, 3, 0, 0, tzinfo=<DstTzInfo 'Europe/Stockholm' CEST+2:00:00 DST>),
datetime.datetime(2022, 1, 10, 0, 0, tzinfo=<DstTzInfo 'Europe/Stockholm' CEST+2:00:00 DST>),
datetime.datetime(2022, 1, 17, 0, 0, tzinfo=<DstTzInfo 'Europe/Stockholm' CEST+2:00:00 DST>),
datetime.datetime(2022, 1, 24, 0, 0, tzinfo=<DstTzInfo 'Europe/Stockholm' CEST+2:00:00 DST>),
datetime.datetime(2022, 1, 31, 0, 0, tzinfo=<DstTzInfo 'Europe/Stockholm' CEST+2:00:00 DST>),
datetime.datetime(2022, 2, 7, 0, 0, tzinfo=<DstTzInfo 'Europe/Stockholm' CEST+2:00:00 DST>),
datetime.datetime(2022, 2, 14, 0, 0, tzinfo=<DstTzInfo 'Europe/Stockholm' CEST+2:00:00 DST>),
datetime.datetime(2022, 2, 21, 0, 0, tzinfo=<DstTzInfo 'Europe/Stockholm' CEST+2:00:00 DST>),
datetime.datetime(2022, 2, 28, 0, 0, tzinfo=<DstTzInfo 'Europe/Stockholm' CEST+2:00:00 DST>)
]
I want to add a zero count for every week that has no results, like this:
[
{'week': datetime.datetime(2022, 1, 3, 0, 0, tzinfo=<DstTzInfo 'Europe/Stockholm' CEST+2:00:00 DST>), 'count': 9},
{'week': datetime.datetime(2022, 1, 10, 0, 0, tzinfo=<DstTzInfo 'Europe/Stockholm' CEST+2:00:00 DST>), 'count': 12},
{'week': datetime.datetime(2022, 1, 17, 0, 0, tzinfo=<DstTzInfo 'Europe/Stockholm' CEST+2:00:00 DST>), 'count': 10},
{'week': datetime.datetime(2022, 1, 24, 0, 0, tzinfo=<DstTzInfo 'Europe/Stockholm' CEST+2:00:00 DST>), 'count': 0},
{'week': datetime.datetime(2022, 1, 31, 0, 0, tzinfo=<DstTzInfo 'Europe/Stockholm' CEST+2:00:00 DST>), 'count': 0},
{'week': datetime.datetime(2022, 2, 7, 0, 0, tzinfo=<DstTzInfo 'Europe/Stockholm' CEST+2:00:00 DST>), 'count': 1},
{'week': datetime.datetime(2022, 2, 14, 0, 0, tzinfo=<DstTzInfo 'Europe/Stockholm' CEST+2:00:00 DST>), 'count': 6},
{'week': datetime.datetime(2022, 2, 21, 0, 0, tzinfo=<DstTzInfo 'Europe/Stockholm' CEST+2:00:00 DST>), 'count': 11},
{'week': datetime.datetime(2022, 2, 28, 0, 0, tzinfo=<DstTzInfo 'Europe/Stockholm' CEST+2:00:00 DST>), 'count': 1}
]
How do I write the right queryset?
Thanks.

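Django's Coalesce function replaces None values, but it only applies to rows that exist. Grouping returns no row at all for a week without issues, so the missing weeks have to be filled in after the query runs.
from django.db.models import Count
from django.db.models.functions import TruncWeek
query1 = MyModel.objects.filter(issue_creator__in=group.user_set.all()).\
annotate(week=TruncWeek('issue_creat_date')).values('week').annotate(count=Count('id')).order_by('week')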
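One way to add the zero rows is to walk through the weeks in Python and look each one up in the query result. This is a minimal sketch, assuming the rows are dicts with 'week' and 'count' keys as in the question. The helper name fill_missing_weeks and its first_week and last_week parameters are hypothetical.

```python
from datetime import datetime, timedelta


def fill_missing_weeks(rows, first_week, last_week):
    """Return one row per week from first_week to last_week, with 0 where no row exists."""
    counts = {row["week"]: row["count"] for row in rows}
    filled = []
    week = first_week
    while week <= last_week:
        filled.append({"week": week, "count": counts.get(week, 0)})
        week += timedelta(weeks=1)
    return filled


# Naive datetimes keep the sample short; rows from the ORM carry tzinfo.
rows = [
    {"week": datetime(2022, 1, 17), "count": 10},
    {"week": datetime(2022, 2, 7), "count": 1},
]
filled = fill_missing_weeks(rows, datetime(2022, 1, 17), datetime(2022, 2, 7))
```

With timezone-aware values that span a daylight-saving change, matching on week.date() instead of the full datetime may be safer. That variation is left out of the sketch.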

Merging a list of dictionaries

I have a list of dicts as below.
list = [ {id: 1, s_id:2, class: 'a', teacher: 'b'} ]
list1 = [ {id: 1, c_id:1, rank:2, area: 34}, {id:1, c_id:2, rank:1, area: 21} ]
I want to merge the two lists on the common key-value pairs (in this case 'id:1')
Merged_list = [ {id:1, s_id:2, class: 'a', teacher: 'b', list1: [ {c_id:1, rank: 2, area: 34}, {c_id:2, rank: 1, area: 21} ] } ]
How do I go about this?
Thanks

You can use
merged_list = [{**d1, **d2} for d1, d2 in zip(list1, list2)]
>>> merged_list
[{'id': 1, 's_id': 2, 'class': 'a', 'teacher': 'b', 'rank': 2, 'area': 34},
{'id': 2, 's_id': 3, 'class': 'c', 'teacher': 'd', 'rank': 1, 'area': 21}]
where {**d1, **d2} is just a neat way to combine two dictionaries. Keep in mind that for duplicate keys the values from the second dictionary replace those from the first. On Python 3.9 or later, you can use d1 | d2 instead.
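The override behaviour for duplicate keys can be seen in a small example. The dictionaries here are made up for illustration and are not the data from the question.

```python
d1 = {"id": 1, "s_id": 2, "rank": 0}
d2 = {"id": 1, "rank": 2, "area": 34}

# "id" and "rank" appear in both dictionaries; the values from d2 win.
merged = {**d1, **d2}
```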
EDIT: For the edited question, you can try this horrible one-liner (keep in mind that it creates the pair 'list1': [] when no entry in list1 has a matching id):
list_ = [{"id": 1, "s_id": 2, "class": 'a', "teacher": 'b'}]
list1 = [{"id": 1, "c_id": 1, "rank": 2, "area": 34}, {"id": 1, "c_id": 2, "rank": 1, "area": 21}]
merged_list = [{**d, **{"list1": [{k: v for k, v in d1.items() if k != "id"} for d1 in list1 if d1["id"] == d["id"]]}} for d in list_]
>>> merged_list
[{'id': 1,
's_id': 2,
'class': 'a',
'teacher': 'b',
'list1': [{'c_id': 1, 'rank': 2, 'area': 34},
{'c_id': 2, 'rank': 1, 'area': 21}]}]
This is equivalent to (with some added benefits):
merged_list = []
for d in list_:
    matched = []
    for d1 in list1:
        if d["id"] == d1["id"]:
            # copy d1 without its id, so list1 itself is not modified
            matched.append({k: v for k, v in d1.items() if k != "id"})
    if matched:  # added benefit of not showing the 'list1' key if nothing matched
        found = {"list1": matched}
    else:
        found = {}
    merged_list.append({**d, **found})
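For longer lists, grouping the second list by id once avoids rescanning it for every entry of the first list. This is a sketch of that alternative, not the approach from the answer above. The function name merge_by_id and its key and nested_key parameters are hypothetical.

```python
from collections import defaultdict


def merge_by_id(parents, children, key="id", nested_key="list1"):
    """Attach to each parent the children that share its key value."""
    grouped = defaultdict(list)
    for child in children:
        # Store the child without the join key, as in the expected output.
        grouped[child[key]].append({k: v for k, v in child.items() if k != key})
    # grouped.get() returns [] for parents without children and adds no new keys.
    return [{**parent, nested_key: grouped.get(parent[key], [])} for parent in parents]


list_ = [{"id": 1, "s_id": 2, "class": "a", "teacher": "b"}]
list1 = [
    {"id": 1, "c_id": 1, "rank": 2, "area": 34},
    {"id": 1, "c_id": 2, "rank": 1, "area": 21},
]
merged_list = merge_by_id(list_, list1)
```

Like the one-liner, this version always adds the nested key, with an empty list when nothing matches.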

Try this.
Remember to put quotes around string keys when declaring the dictionaries. Note that + only concatenates the two lists; it does not merge entries that share an id.
list = [ {"id": 1, "s_id": 2, "class": 'a', "teacher": 'b'}, {"id": 2, "s_id": 3, "class": 'c', "teacher": 'd'} ]
list1 = [ {"id": 1, "rank": 2, "area": 34}, {"id": 2, "rank": 1, "area": 21} ]
list2 = list1 + list
print(list2)

How to exclude items with identical field if the datefield is bigger than in others duplicates?

So I have a Comments model and by querying
comments = Comments.objects.values('students_id', 'created_at')
I get this output
<QuerySet [
{'students_id': 4, 'created_at': datetime.date(2019, 6, 19)}, {'students_id': 2, 'created_at': datetime.date(2019, 6, 3)}, {'students_id': 1, 'created_at': datetime.date(2019, 6, 24)}, {'students_id': 6, 'created_at': datetime.date(2019, 6, 4)}, {'students_id': 6, 'created_at': datetime.date(2019, 6, 19)}, {'students_id': 5, 'created_at': datetime.date(2019, 6, 5)}, {'students_id': 4, 'created_at': datetime.date(2019, 7, 28)}, {'students_id': 6, 'created_at': datetime.date(2019, 6, 11)}]>
There are three comments by the student with id=6 and two comments by the student with id=4.
What I need is only the latest comment from each student. In this example it would look like this:
<QuerySet [
{'students_id': 2, 'created_at': datetime.date(2019, 6, 3)}, {'students_id': 1, 'created_at': datetime.date(2019, 6, 24)}, {'students_id': 6, 'created_at': datetime.date(2019, 6, 19)}, {'students_id': 5, 'created_at': datetime.date(2019, 6, 5)}, {'students_id': 4, 'created_at': datetime.date(2019, 7, 28)},]>
Thanks in advance for the answer!

You can use values() with annotate() and Max to get the desired result. Import Max from django.db.models first:
from django.db.models import Max
Comments.objects.values('students_id').annotate(Max('created_at'))
The output contains each students_id together with its latest created_at:
<QuerySet [
{'students_id': 2, 'created_at__max': datetime.date(2019, 6, 3)}, {'students_id': 1, 'created_at__max': datetime.date(2019, 6, 24)},]>

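Use this code. QuerySet has no group_by() method; calling values('students_id') before annotate() performs the grouping, and created_at must stay out of values() or every row becomes its own group. Do not call delete() on the result, since the question asks to exclude the older rows from the output, not to remove them from the database.
from django.db.models import Max
queryset = Comments.objects.values('students_id').annotate(latest_created_at=Max('created_at'))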

In raw SQL it would be ... WHERE NOT EXISTS(SELECT * FROM Comments cc WHERE cc.students_id = c.students_id AND cc.created_at > c.created_at)
from django.db.models import Exists, OuterRef
later_comments = Comments.objects.filter(
    students_id=OuterRef('students_id'),
    created_at__gt=OuterRef('created_at'),
).values('created_at')
latest_comments = Comments.objects.annotate(
    has_later_comments=Exists(later_comments),
).filter(has_later_comments=False)
If created_at is a date column (no time), a strict > is not enough, because more than one comment can be created on the same day and all of them would be kept. Changing > to >= alone would make every comment match itself, so the query needs an additional predicate on a tie-breaking column such as id: WHERE cc.created_at > c.created_at OR (cc.created_at = c.created_at AND cc.id > c.id)
https://docs.djangoproject.com/en/2.2/ref/models/expressions/#exists-subqueries
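The logic of the NOT EXISTS predicate with a tie-break on id can be sketched in plain Python. This only illustrates the comparison and does not replace the ORM query. The id values in the sample are hypothetical, since the question's output does not show them.

```python
from datetime import date


def latest_per_student(comments):
    """Keep a comment only if no other comment by the same student is later.

    Ties on created_at are broken by the higher id, as in the SQL predicate.
    """
    def is_later(other, current):
        # Tuple comparison: later date wins, then higher id on equal dates.
        return (other["created_at"], other["id"]) > (current["created_at"], current["id"])

    return [
        c for c in comments
        if not any(
            o["students_id"] == c["students_id"] and is_later(o, c)
            for o in comments
        )
    ]


comments = [
    {"id": 1, "students_id": 6, "created_at": date(2019, 6, 4)},
    {"id": 2, "students_id": 6, "created_at": date(2019, 6, 19)},
    {"id": 3, "students_id": 6, "created_at": date(2019, 6, 19)},
    {"id": 4, "students_id": 4, "created_at": date(2019, 7, 28)},
]
latest = latest_per_student(comments)
```

Comments 2 and 3 share a date, so only the one with the higher id survives for student 6.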

Access Pandas MultiIndex column by name

I have a spreadsheet imported with pandas like this:
df = pd.read_excel('my_spreadsheet.xlsx',header = [0,1],index_col=0,sheetname='Sheet1')
The output of df.columns is:
MultiIndex(levels=[[u'MR 1', u'MR 10', u'MR 11', u'MR 12', u'MR 13', u'MR 14', u'MR 15', u'MR 16', u'MR 17', u'MR 18', u'MR 19', u'MR 2', u'MR 20', u'MR 21', u'MR 22', u'MR 3', u'MR 4', u'MR 5', u'MR 6', u'MR 7', u'MR 8', u'MR 9'], [u'BIRADS', u'ExamDesc', u'completedDTTM']],
labels=[[0, 0, 0, 11, 11, 11, 15, 15, 15, 16, 16, 16, 17, 17, 17, 18, 18, 18, 19, 19, 19, 20, 20, 20, 21, 21, 21, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7, 8, 8, 8, 9, 9, 9, 10, 10, 10, 12, 12, 12, 13, 13, 13, 14, 14, 14], [1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0]],
names=[None, u'De-Identified MRN'])
I have been trying to access the values of column named 'De-Identified MRN', but can't seem to find the way to do this.
What I have tried (based on similar posts):
[in] df.index.get_level_values('De-Identified MRN')
[out] KeyError: 'Level De-Identified MRN must be same as name (None)'
and
[in] df.index.unique(level='De-Identified MRN')
[out] KeyError: 'Level De-Identified MRN must be same as name (None)'
UPDATE:
The following did the trick for some reason. I really do not understand the format of the MultiIndex Pandas Dataframe:
pd.Series(df.index)

Using your data:
s="MultiIndex(levels=[[u'MR 1', u'MR 10', u'MR 11', u'MR 12', u'MR 13', u'MR 14', u'MR 15', u'MR 16', u'MR 17', u'MR 18', u'MR 19', u'MR 2', u'MR 20', u'MR 21', u'MR 22', u'MR 3', u'MR 4', u'MR 5', u'MR 6', u'MR 7', u'MR 8', u'MR 9'], [u'BIRADS', u'ExamDesc', u'completedDTTM']],labels=[[0, 0, 0, 11, 11, 11, 15, 15, 15, 16, 16, 16, 17, 17, 17, 18, 18, 18, 19, 19, 19, 20, 20, 20, 21, 21, 21, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7, 8, 8, 8, 9, 9, 9, 10, 10, 10, 12, 12, 12, 13, 13, 13, 14, 14, 14], [1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0]],names=[None, u'De-Identified MRN'])"
idx=eval(s, {}, {'MultiIndex': pd.MultiIndex})
df=pd.DataFrame(index=idx)
df.index.get_level_values(level=1) # df.index.get_level_values('De-Identified MRN')
Out[336]:
Index(['ExamDesc', 'completedDTTM', 'BIRADS', 'ExamDesc', 'completedDTTM',
'BIRADS', 'ExamDesc', 'completedDTTM', 'BIRADS', 'ExamDesc',...
If none of the above works, try:
df.reset_index()['De-Identified MRN']

Try the following:
midx = pd.MultiIndex(
levels=[[u'MR 1', u'MR 10', u'MR 11', u'MR 12', u'MR 13', u'MR 14', u'MR 15', u'MR 16', u'MR 17', u'MR 18', u'MR 19', u'MR 2', u'MR 20', u'MR 21', u'MR 22', u'MR 3', u'MR 4', u'MR 5', u'MR 6', u'MR 7', u'MR 8', u'MR 9'], [u'BIRADS', u'ExamDesc', u'completedDTTM']],
labels=[[0, 0, 0, 11, 11, 11, 15, 15, 15, 16, 16, 16, 17, 17, 17, 18, 18, 18, 19, 19, 19, 20, 20, 20, 21, 21, 21, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 7, 7, 7, 8, 8, 8, 9, 9, 9, 10, 10, 10, 12, 12, 12, 13, 13, 13, 14, 14, 14], [1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2, 0]],
names=[None, u'De-Identified MRN']
)
midx.levels[1] # returns the following
Index(['BIRADS', 'ExamDesc', 'completedDTTM'], dtype='object', name='De-Identified MRN')
midx.levels[1].values # returns the following
array(['BIRADS', 'ExamDesc', 'completedDTTM'], dtype=object)

Django count number of records per day

I'm using Django 2.0
I am preparing data to show on a graph in a template. I want to fetch the number of records per day.
This is what I'm doing
qs = self.get_queryset().\
extra({'date_created': "date(created)"}).\
values('date_created').\
annotate(item_count=Count('id'))
but, the output given is
[
{'date_created': datetime.date(2018, 5, 24), 'item_count': 1},
{'date_created': datetime.date(2018, 5, 24), 'item_count': 1},
{'date_created': datetime.date(2018, 5, 24), 'item_count': 1},
{'date_created': datetime.date(2018, 5, 24), 'item_count': 1},
{'date_created': datetime.date(2018, 5, 24), 'item_count': 1},
{'date_created': datetime.date(2018, 5, 24), 'item_count': 1},
{'date_created': datetime.date(2018, 5, 24), 'item_count': 1}
]
Here the data is not grouped, and the same date is returned repeatedly with a count of 1.

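Try the TruncDate function from django.db.models.functions instead of extra(): annotate date_created=TruncDate('created'), group with values('date_created'), annotate the count, and finish with an explicit order_by('date_created'). If the model or queryset has a default ordering, its fields are likely being added to the GROUP BY, which would explain the repeated rows; the explicit order_by replaces that ordering.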
