I have the following models:
from django.db import models

class CloudObjects(models.Model):
    object_id = models.AutoField(primary_key=True)
    object_name = models.CharField(max_length=256)
    creation_time = models.DateTimeField()
    removed_date = models.DateTimeField(blank=True, null=True)
    # BackupItems is defined below, so it is referenced by name
    item = models.ManyToManyField('BackupItems', db_table='cloud_object_items')

class BackupItems(models.Model):
    name = models.CharField(max_length=100)
For each BackupItems row, I'd like to annotate the most recent creation_time of the related CloudObjects whose removed_date is in the future.
As an example:
The CloudObjects table looks like this:
object_id | object_name | creation_time | removed_date | item
1 | object_one_in_cloud | 2021-01-01 | 2021-10-01 | 1
2 | object_two_in_cloud | 2021-02-02 | 2099-12-31 | 1
3 | object_three_in_cloud | 2021-03-03 | 2099-12-31 | 1
4 | object_four_in_cloud | 2021-12-31 | 2022-01-01 | 1
For the example above, I'd like to annotate item 1 with object 3: its removed_date is in the future and it is the most recent such object (object 2 is also scheduled for removal in the future, but object 3 is newer).
Now I'd like to do this annotation in my view. I tried different approaches, but I can't move forward.
This is the last one I tried:
from django.db.models import Subquery, OuterRef

class BackupListView(ListView):
    template_name = 'somefile.html'

    def get_queryset(self):
        last_item = CloudObjects.objects.filter(item=OuterRef("pk")).filter(removed_date__gte=timezone.now()).last()
        all_items = BackupItems.objects.annotate(last_backup=Subquery(last_item.get('creation_time')))
        return all_items
How to get it working?
As described in the docs:
(Using get() instead of a slice would fail because the OuterRef cannot be resolved until the queryset is used within a Subquery.)
last() behaves like get() here: it tries to evaluate the queryset immediately, but an OuterRef can only be resolved once the queryset is used inside a Subquery. That's why it doesn't work. Use slicing instead, like so:
def get_queryset(self):
    cloud_objects = CloudObjects.objects.filter(
        item=OuterRef("pk")
    ).filter(
        removed_date__gte=timezone.now()
    ).order_by(
        "-creation_time"
    )
    all_items = BackupItems.objects.annotate(
        last_backup_date=Subquery(
            cloud_objects.values("creation_time")[:1]
        )
    )
    return all_items
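To see what the sliced subquery is doing, here is a rough sketch of the SQL shape it corresponds to, run against an in-memory SQLite database. The table and column names (backup_items, cloud_objects, and the join table columns) and the second item are assumptions for illustration; the SQL Django actually generates will differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE backup_items (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE cloud_objects (
    object_id INTEGER PRIMARY KEY,
    object_name TEXT,
    creation_time TEXT,
    removed_date TEXT
);
-- hypothetical column names for the many-to-many join table
CREATE TABLE cloud_object_items (cloudobjects_id INTEGER, backupitems_id INTEGER);

INSERT INTO backup_items VALUES (1, 'item one'), (2, 'item two');
INSERT INTO cloud_objects VALUES
    (1, 'object_one_in_cloud',   '2021-01-01', '2021-10-01'),
    (2, 'object_two_in_cloud',   '2021-02-02', '2099-12-31'),
    (3, 'object_three_in_cloud', '2021-03-03', '2099-12-31'),
    (4, 'object_four_in_cloud',  '2021-12-31', '2022-01-01');
INSERT INTO cloud_object_items VALUES (1, 1), (2, 1), (3, 1), (4, 1);
""")


def latest_backups(conn, now):
    """Annotate each item with the newest creation_time among its
    cloud objects whose removed_date is not before `now`."""
    sql = """
    SELECT b.id, b.name,
           (SELECT c.creation_time
              FROM cloud_objects c
              JOIN cloud_object_items ci ON ci.cloudobjects_id = c.object_id
             WHERE ci.backupitems_id = b.id      -- plays the role of OuterRef("pk")
               AND c.removed_date >= ?
             ORDER BY c.creation_time DESC
             LIMIT 1                             -- plays the role of the [:1] slice
           ) AS last_backup_date
      FROM backup_items b
     ORDER BY b.id
    """
    return conn.execute(sql, (now,)).fetchall()


# "now" is fixed here so the example is reproducible
rows = latest_backups(conn, "2022-06-01")
```

The inner SELECT is only meaningful inside the outer one, because it refers to b.id. That is the same reason get() and last() cannot be called on a queryset containing an OuterRef. Items with no matching object (item 2 here) get NULL for the annotation.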
I am using the simple history library for my Django project. It's pretty nifty, but I'm having trouble showing aggregated history stats next to a base model object.
Here's what my model looks like:
from django.db import models
from django.contrib.auth.models import User
from simple_history.models import HistoricalRecords

class RepairForm(models.Model):
    user_id = models.ForeignKey(User, on_delete=models.DO_NOTHING,)
    return_number = models.CharField(max_length=200, unique=True)
    status_id = models.ForeignKey(RFormStatus, on_delete=models.DO_NOTHING)
    ...
    history = HistoricalRecords()

    def __str__(self):
        return self.return_number
The Docs lead me to believe the proper way of accessing historical records is using the history manager. I can get both sets of information I want:
All Forms (base model objects) -
RepairForm.objects.all()
User ID | Return Number | Status ID
-----------------------------------------------------------
33 | 0a6e6ef0-a444-4b63-bd93-ae55fe8a3cee | 65001
44 | 5f699795-5119-4dcd-8b94-34f7056e732c | 65002
...
A history calculation (history object)
In this example I am getting the latest event of each form -
RepairForm.history.all()\
    .values('return_number').annotate(latest_event_date=Max('history_date'))\
    .order_by('return_number')
Return Number | latest_event_date
-----------------------------------------------------------
0a6e6ef0-a444-4b63-bd93-ae55fe8a3cee | 7/27/2018
5f699795-5119-4dcd-8b94-34f7056e732c | 8/1/2018
...
I feel like this should be possible in one query, though, no? One query that outputs something like this:
User ID | Return Number | Status ID | latest_event_date
------------------------------------------------------------------------------
33 | 0a6e6ef0-a444-4b63-bd93-ae55fe8a3cee | 65001 | 7/27/2018
44 | 5f699795-5119-4dcd-8b94-34f7056e732c | 65002 | 8/1/2018
...
You can add a property, for example latest_event_date, that calculates the wanted result. It is then available on every RepairForm instance a query returns, although it is evaluated per object when accessed rather than as part of the query itself.
@property
def latest_event_date(self):
    ....
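For comparison, the single-query result the question asks for can be sketched directly in SQL. The snippet below uses an in-memory SQLite database with hypothetical table names (repairform, historical_repairform) and one made-up extra history row; the real tables created by the history library will be named and shaped differently. In the ORM, a correlated subquery like this is usually written with Subquery and OuterRef.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- hypothetical, simplified versions of the base table and its history table
CREATE TABLE repairform (
    id INTEGER PRIMARY KEY, user_id INTEGER, return_number TEXT UNIQUE, status_id INTEGER
);
CREATE TABLE historical_repairform (
    history_id INTEGER PRIMARY KEY, id INTEGER, return_number TEXT, history_date TEXT
);

INSERT INTO repairform VALUES
    (1, 33, '0a6e6ef0-a444-4b63-bd93-ae55fe8a3cee', 65001),
    (2, 44, '5f699795-5119-4dcd-8b94-34f7056e732c', 65002);
INSERT INTO historical_repairform (id, return_number, history_date) VALUES
    (1, '0a6e6ef0-a444-4b63-bd93-ae55fe8a3cee', '2018-07-20'),  -- made-up earlier event
    (1, '0a6e6ef0-a444-4b63-bd93-ae55fe8a3cee', '2018-07-27'),
    (2, '5f699795-5119-4dcd-8b94-34f7056e732c', '2018-08-01');
""")

# One query: every form row plus the date of its latest history event
rows = conn.execute("""
SELECT f.user_id, f.return_number, f.status_id,
       (SELECT MAX(h.history_date)
          FROM historical_repairform h
         WHERE h.id = f.id) AS latest_event_date
  FROM repairform f
 ORDER BY f.id
""").fetchall()
```

Unlike the property, this computes the date for all rows in one round trip, which matters when listing many forms.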
I am trying to convert the SQL query below into a Django ORM query, but I have not been able to reproduce the output of the SQL statement.
Models
class YearlyTable(models.Model):
    class Meta:
        db_table = 'yearlytable'
        managed = True
    user_id = models.IntegerField(db_index=True)
    rotations = models.IntegerField()
    calories = models.FloatField()
    distance = models.FloatField()
    duration = models.IntegerField(default=0)
    year = models.IntegerField()
    created = models.DateTimeField(auto_now_add=True)
    modified = models.DateTimeField(auto_now=True)

class User(AbstractBaseUser):
    class Meta:
        db_table = 'users'
        managed = True
    email = models.EmailField(max_length=255, unique=True)
    first_name = models.CharField(max_length=255, blank=True, null=True)
    city = models.CharField(max_length=200, blank=True, null=True)
    state = models.CharField(max_length=200, blank=True, null=True)
    postal_code = models.IntegerField(blank=True, null=True)
    country = models.CharField(max_length=200, blank=True, null=True)
SELECT
users.state,
sum(yearlytable.rotations) as sum_rotations,
sum(yearlytable.calories) as sum_calories,
sum(yearlytable.distance) as sum_distance
FROM yearlytable
INNER JOIN users on (yearlytable.user_id = users.id)
WHERE yearlytable.user_id in(SELECT id FROM users WHERE country LIKE 'United States%' and NOT ("email" LIKE '%yopmail.com%'))
GROUP BY users.state
Then I tried to execute the query above as a raw Django query, for example:
User.objects.raw('select users.state,sum(yearlytable.rotations) as sum_rotations,sum(yearlytable.calories) as sum_calories,sum(yearlytable.distance) as sum_distance from yearlytable inner join users on (yearlytable.user_id = users.id) where yearlytable.user_id in(select id from users where country like \'United States%\' and NOT ("email" LIKE \'%yopmail.com%\')) group by users.state;')
But this didn't work either. I don't want to use a cursor for this because I'm worried about SQL injection, so a cursor is off the table.
for u in User.objects.raw('select users.state,sum(yearlytable.rotations) as sum_rotations,sum(yearlytable.calories) as sum_calories,sum(yearlytable.distance) as sum_distance from yearlytable inner join users on (yearlytable.user_id = users.id) where yearlytable.user_id in(select id from users where country like \'United States%\' and NOT ("email" LIKE \'%yopmail.com%\')) group by users.state;'):
    print u
Below is the stack trace:
Traceback:
File "/home/akki/rest_api/venv/local/lib/python2.7/site-packages/django/core/handlers/base.py" in get_response
111. response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "/home/akki/rest_api/venv/local/lib/python2.7/site-packages/django/views/decorators/csrf.py" in wrapped_view
57. return view_func(*args, **kwargs)
File "/home/akki/rest_api/venv/local/lib/python2.7/site-packages/django/views/generic/base.py" in view
69. return self.dispatch(request, *args, **kwargs)
File "/home/akki/rest_api/venv/local/lib/python2.7/site-packages/rest_framework/views.py" in dispatch
407. response = self.handle_exception(exc)
File "/home/akki/rest_api/venv/local/lib/python2.7/site-packages/rest_framework/views.py" in dispatch
404. response = handler(request, *args, **kwargs)
File "/home/akki/rest_api/venv/local/lib/python2.7/site-packages/rest_framework/decorators.py" in handler
51. return func(*args, **kwargs)
File "/home/akki/rest_api/widget/views.py" in heat_map
18. for u in User.objects.raw('select users.state,sum(yearlytable.rotations) as sum_rotations,sum(yearlytable.calories) as sum_calories,sum(yearlytable.distance) as sum_distance from yearlytable inner join users on (yearlytable.user_id = users.id) where yearlytable.user_id in(select id from users where country like \'United States%\' and NOT ("email" LIKE \'%yopmail.com%\')) group by users.state;'):
File "/home/akki/rest_api/venv/local/lib/python2.7/site-packages/django/db/models/query.py" in __iter__
1535. query = iter(self.query)
File "/home/akki/rest_api/venv/local/lib/python2.7/site-packages/django/db/models/sql/query.py" in __iter__
76. self._execute_query()
File "/home/akki/rest_api/venv/local/lib/python2.7/site-packages/django/db/models/sql/query.py" in _execute_query
90. self.cursor.execute(self.sql, self.params)
File "/home/akki/rest_api/venv/local/lib/python2.7/site-packages/django/db/backends/utils.py" in execute
81. return super(CursorDebugWrapper, self).execute(sql, params)
File "/home/akki/rest_api/venv/local/lib/python2.7/site-packages/django/db/backends/utils.py" in execute
65. return self.cursor.execute(sql, params)
The Django ORM which I tried was:
YearlyTable.objects.annotate(r=Sum('rotations'))
It would be great to express this SQL query with the Django ORM.
Assumptions:
Use the Django ORM without resorting to raw SQL.
Design the Django models idiomatically, meaning related tables should use ForeignKey, OneToOneField or ManyToManyField.
YearlyTable is assumed to have a one-to-one relationship with User.
In models.py:
from django.db import models
from django.contrib.auth.models import AbstractBaseUser

class User(AbstractBaseUser):
    email = models.EmailField(max_length=255, unique=True)
    first_name = models.CharField(max_length=255, blank=True, null=True)
    city = models.CharField(max_length=200, blank=True, null=True)
    state = models.CharField(max_length=200, blank=True, null=True)
    postal_code = models.IntegerField(blank=True, null=True)
    country = models.CharField(max_length=200, blank=True, null=True)

    def __unicode__(self):
        return self.email

class YearlyTable(models.Model):
    user = models.OneToOneField('User', unique=True)
    rotations = models.IntegerField()
    calories = models.FloatField()
    distance = models.FloatField()
    duration = models.IntegerField(default=0)
    year = models.IntegerField()
    created = models.DateTimeField(auto_now_add=True)
    modified = models.DateTimeField(auto_now=True)

    def __unicode__(self):
        return str(self.user)
I populated the tables with the following sample data:
u = User(email='a@w.com', first_name='ab', city='New York', state='New York', postal_code='12345', country='United States')
y = YearlyTable(user=u, rotations=10, calories=10.8, distance=12.5, duration=20, year=2011)
u = User(email='b@w.com', first_name='ac', city='Buffalo', state='New York', postal_code='67891', country='United States')
y = YearlyTable(user=u, rotations=8, calories=11.8, distance=11.5, duration=30, year=2012)
u = User(email='c@w.com', first_name='ad', city='Rochester', state='New York', postal_code='13579', country='United States')
y = YearlyTable(user=u, rotations=20, calories=15.8, distance=13.5, duration=40, year=2013)
u = User(email='d@w.com', first_name='ae', city='Pittsburgh', state='Pennsylvania', postal_code='98765', country='United States')
y = YearlyTable(user=u, rotations=30, calories=10.2, distance=12.5, duration=40, year=2012)
u = User(email='e@w.com', first_name='af', city='Los Angeles', state='California', postal_code='97531', country='United States')
y = YearlyTable(user=u, rotations=10, calories=14.8, distance=13.5, duration=10, year=2010)
Checking the physical tables and querying them directly:
psql -d
# select * from testapp_user;
id | password | last_login | email | first_name | city | state | postal_code | country
----+----------+------------+---------+------------+-------------+--------------+-------------+---------------
1 | | | a@w.com | ab | New York | New York | 12345 | United States
2 | | | b@w.com | ac | Buffalo | New York | 67891 | United States
3 | | | c@w.com | ad | Rochester | New York | 13579 | United States
4 | | | d@w.com | ae | Pittsburgh | Pennsylvania | 98765 | United States
5 | | | e@w.com | af | Los Angeles | California | 97531 | United States
(5 rows)
# select * from testapp_yearlytable;
id | rotations | calories | distance | duration | year | created | modified | user_id
----+-----------+----------+----------+----------+------+-------------------------------+-------------------------------+---------
1 | 10 | 10.8 | 12.5 | 20 | 2011 | 2016-05-17 16:23:46.39941+00 | 2016-05-17 16:23:46.399445+00 | 1
3 | 8 | 11.8 | 11.5 | 30 | 2012 | 2016-05-17 16:24:26.264569+00 | 2016-05-17 16:24:26.264606+00 | 2
4 | 20 | 15.8 | 13.5 | 40 | 2013 | 2016-05-17 16:24:51.200739+00 | 2016-05-17 16:24:51.200785+00 | 3
5 | 30 | 10.2 | 12.5 | 40 | 2012 | 2016-05-17 16:25:08.187799+00 | 2016-05-17 16:25:08.187852+00 | 4
6 | 10 | 14.8 | 13.5 | 10 | 2010 | 2016-05-17 16:25:24.846284+00 | 2016-05-17 16:25:24.846324+00 | 5
(5 rows)
# SELECT
testapp_user.state,
sum(testapp_yearlytable.rotations) as sum_rotations,
sum(testapp_yearlytable.calories) as sum_calories,
sum(testapp_yearlytable.distance) as sum_distance
FROM testapp_yearlytable
INNER JOIN testapp_user on (testapp_yearlytable.user_id = testapp_user.id)
WHERE testapp_yearlytable.user_id in
(SELECT id FROM testapp_user
WHERE country LIKE 'United States%' and
NOT ("email" LIKE '%a@w.com%'))
GROUP BY testapp_user.state;
state | sum_rotations | sum_calories | sum_distance
--------------+---------------+--------------+--------------
New York | 28 | 27.6 | 25
Pennsylvania | 30 | 10.2 | 12.5
California | 10 | 14.8 | 13.5
Running in python shell
> python manage.py shell
Python 2.7.6 (default, Jun 22 2015, 18:00:18)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> from testapp.models import User, YearlyTable
>>> from django.db.models import Q, Sum
>>> User.objects.filter(~Q(email__icontains='a@w.com'), country__startswith='United States') \
... .values('state') \
... .annotate(sum_rotations = Sum('yearlytable__rotations'), \
... sum_calories = Sum('yearlytable__calories'), \
... sum_distance = Sum('yearlytable__distance'))
[{'sum_rotations': 28, 'state': u'New York', 'sum_calories': 27.6, 'sum_distance': 25.0}, {'sum_rotations': 30, 'state': u'Pennsylvania', 'sum_calories': 10.2, 'sum_distance': 12.5}, {'sum_rotations': 10, 'state': u'California', 'sum_calories': 14.8, 'sum_distance': 13.5}]
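To make the grouping explicit, here is a plain-Python sketch of what the filter(...).values('state').annotate(Sum(...)) chain computes on the sample data above. The function name sums_by_state and the tuple layout are made up for illustration; this is not how the ORM works internally, only the same arithmetic spelled out.

```python
from collections import defaultdict

# (email, state, country, rotations, calories, distance), from the sample data
rows = [
    ("a@w.com", "New York", "United States", 10, 10.8, 12.5),
    ("b@w.com", "New York", "United States", 8, 11.8, 11.5),
    ("c@w.com", "New York", "United States", 20, 15.8, 13.5),
    ("d@w.com", "Pennsylvania", "United States", 30, 10.2, 12.5),
    ("e@w.com", "California", "United States", 10, 14.8, 13.5),
]


def sums_by_state(rows, excluded_email):
    """Group the filtered rows by state and sum the three metrics."""
    totals = defaultdict(
        lambda: {"sum_rotations": 0, "sum_calories": 0.0, "sum_distance": 0.0}
    )
    for email, state, country, rotations, calories, distance in rows:
        # filter(): keep US users whose email does not contain the excluded
        # text, compared case-insensitively like icontains
        if not country.startswith("United States"):
            continue
        if excluded_email.lower() in email.lower():
            continue
        # values('state') + annotate(Sum(...)): one accumulator per state
        group = totals[state]
        group["sum_rotations"] += rotations
        group["sum_calories"] += calories
        group["sum_distance"] += distance
    return dict(totals)


result = sums_by_state(rows, "a@w.com")
```

The key point is the order: calling values('state') before annotate() is what turns the sums into per-state totals instead of per-row values.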
It seems like this can be done using the aggregation framework with the following ORM query:
1) We filter User to find the rows that match the innermost SELECT statement. This returns a list of User.id values.
2) values() is called first on YearlyTable so that the later annotate() performs the GROUP BY on User.state.
3) distinct() is used to ensure we only account for each possible User.state once.
4) annotate() is used to compute the Sum of the values you wanted.
5) Finally, we call values() again to build dictionaries containing the information you requested in the top-level SELECT query.
from django.db.models import Sum

YearlyTable.objects.filter(
    user_id__in=User.objects.filter(
        country__startswith='United States'
    ).exclude(
        email__contains='yopmail.com'
    ).values_list('id', flat=True)
).values('user__state').distinct().annotate(
    sum_rotations=Sum('rotations'),
    sum_calories=Sum('calories'),
    sum_distance=Sum('distance')
).values('user__state', 'sum_rotations', 'sum_calories', 'sum_distance')
Consider the following models:
class Publisher(models.Model):
    name = models.CharField(max_length=300)
    num_awards = models.IntegerField()

class Book(models.Model):
    name = models.CharField(max_length=300)
    pages = models.IntegerField()
    publisher = models.ForeignKey(Publisher, related_name='related_books')
From a Publisher instance, how can I get the number of books with distinct values of the pages field? For example:
| name | pages | publisher |
|-----------|-------|-----------|
| Golden | 20 | 1 |
| Grey | 23 | 1 |
| Blue | 20 | 1 |
| Grotesque | 27 | 2 |
If I have publisher = Publisher.objects.get(id=1) how can I achieve something like this:
# equals to 2 [Golden, Grey]
publisher.related_books.all().distinct('pages').count()
You were close; you just need to restrict the returned values, like so:
publisher.related_books.all().values('pages').distinct('pages').count()
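This will just give you the number of different page lengths for a publisher, but not the associated books for each page length. To get those you'd probably need an extra query. Note that passing field names to distinct() is supported only on PostgreSQL; on other backends, values('pages').distinct().count() should give the same count.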
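As a sanity check on the expected number, here is a small sketch using an in-memory SQLite database and the sample rows from the question. The table name book and the helper distinct_page_counts are assumptions for illustration; the query shows roughly the counting the ORM expression boils down to, not the exact SQL Django emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE book (name TEXT, pages INTEGER, publisher_id INTEGER);
INSERT INTO book VALUES
    ('Golden', 20, 1),
    ('Grey', 23, 1),
    ('Blue', 20, 1),
    ('Grotesque', 27, 2);
""")


def distinct_page_counts(conn, publisher_id):
    """Count how many different page lengths one publisher's books have."""
    row = conn.execute(
        "SELECT COUNT(DISTINCT pages) FROM book WHERE publisher_id = ?",
        (publisher_id,),
    ).fetchone()
    return row[0]


count_for_publisher_1 = distinct_page_counts(conn, 1)  # Golden and Blue share 20 pages
count_for_publisher_2 = distinct_page_counts(conn, 2)
```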
If you want reusable queries, you could do this:
class BookQuerySet(models.QuerySet):
    def by_publisher(self, publisher):
        return self.filter(publisher=publisher)

    def distinct_number_of_pages(self):
        # the field name must be passed as a string
        return self.distinct('pages')

class Book(...):
    ...
    objects = BookQuerySet.as_manager()

class Publisher(...):
    @property
    def number_of_page_lengths(self):
        return Book.objects.by_publisher(self).distinct_number_of_pages().count()
I have two models that look like this:
class Node(models.Model):
    user = models.ForeignKey(User, null=False)
    name = models.CharField(max_length=50)

class Activation(models.Model):
    node = models.ForeignKey(Node, null=False)
    active = models.BooleanField(default=False)
    datetime = models.DateTimeField(default=datetimeM.datetime.now)
The activation table stores whether a given node is "active" or not. So to figure out whether a node is active, one needs to get the latest activation record for that node.
I'm trying to figure out how to write a django query that returns all active nodes.
Here is some example data
Node Table
id | name
--------------------
0 | andrew
1 | bill
2 | bob
Activation Table
id | nodeId | active | datetime
--------------------
0 | 0 | false | 01-01-2013:00:01:02
1 | 0 | true | 01-02-2013:00:01:02
2 | 0 | false | 01-03-2013:00:01:02
3 | 1 | false | 01-04-2013:00:01:02
4 | 0 | true | 01-05-2013:00:01:02
5 | 1 | true | 01-06-2013:00:01:02
6 | 2 | false | 01-07-2013:00:01:02
So the query would need to return [node0, node1]
class Node(models.Model):
    user = models.ForeignKey(User, null=False)
    name = models.CharField(max_length=50)

class Activation(models.Model):
    node = models.ForeignKey(Node, null=False, related_name='activations')
    active = models.BooleanField(default=False)
    datetime = models.DateTimeField(default=datetimeM.datetime.now)

# Latest activation record for a given node:
try:
    latest_activation = node.activations.latest('id')
except Activation.DoesNotExist:
    latest_activation = None

# Return all active nodes. Note: this matches nodes that have any
# activation with active=True, not only those whose latest one is active.
all_active_nodes = Node.objects.filter(activations__active=True)
Update:
See this question that I posted yesterday:
Django reverse query by the last created object
Maybe it will help you.
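To pin down what "active" means here (the latest activation per node has active = true), this is a sketch of the query shape in plain SQL, using an in-memory SQLite database and the example data from the question. Table names and the ISO date format are assumptions for illustration. In the ORM, this shape would typically be an annotate() with a sliced Subquery over Activation filtered by OuterRef('pk'), followed by a filter on the annotation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE node (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE activation (
    id INTEGER PRIMARY KEY, node_id INTEGER, active INTEGER, datetime TEXT
);

INSERT INTO node VALUES (0, 'andrew'), (1, 'bill'), (2, 'bob');
INSERT INTO activation VALUES
    (0, 0, 0, '2013-01-01 00:01:02'),
    (1, 0, 1, '2013-01-02 00:01:02'),
    (2, 0, 0, '2013-01-03 00:01:02'),
    (3, 1, 0, '2013-01-04 00:01:02'),
    (4, 0, 1, '2013-01-05 00:01:02'),
    (5, 1, 1, '2013-01-06 00:01:02'),
    (6, 2, 0, '2013-01-07 00:01:02');
""")

# Keep only nodes whose most recent activation row is active
active_nodes = conn.execute("""
SELECT n.id, n.name
  FROM node n
 WHERE (SELECT a.active
          FROM activation a
         WHERE a.node_id = n.id
         ORDER BY a.datetime DESC
         LIMIT 1) = 1
 ORDER BY n.id
""").fetchall()
```

A plain join filter on active = true would also return a node that was active once and later deactivated, and could return the same node several times. The correlated subquery avoids both.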