Django lazy QuerySet and pagination - django

I read that Django querysets are lazy: a queryset is not evaluated until it is actually used, for example when it is printed. I have built simple pagination using Django's built-in paginator. I didn't realize there were already apps such as "django-pagination" and "django-endless" that do that job for you.
Anyway, I wonder whether the QuerySet is still lazy when I do, for example, this:
entries = Entry.objects.filter(...)
paginator = Paginator(entries, 10)
output = paginator.page(page)
return HttpResponse(output)
This code runs every time I request whichever page I currently want to view.
I need to know because I don't want to put unnecessary load on the database.

If you want to see where queries are occurring, import django.db.connection and inspect its queries attribute (it is only populated when DEBUG is True):
>>> from django.db import connection
>>> from django.core.paginator import Paginator
>>> queryset = Entry.objects.all()
Let's create the paginator and see whether any queries occur:
>>> paginator = Paginator(queryset, 10)
>>> print connection.queries
[]
None yet.
>>> page = paginator.page(4)
>>> page
<Page 4 of 788>
>>> print connection.queries
[{'time': '0.014', 'sql': 'SELECT COUNT(*) FROM `entry`'}]
Creating the page has produced one query, to count how many entries are in the queryset. The entries have not been fetched yet.
Assign the page's objects to the variable 'objects':
>>> objects = page.object_list
>>> print connection.queries
[{'time': '0.014', 'sql': 'SELECT COUNT(*) FROM `entry`'}]
This still hasn't caused the entries to be fetched.
Generate the HttpResponse from the object list
>>> response = HttpResponse(page.object_list)
>>> print connection.queries
[{'time': '0.014', 'sql': 'SELECT COUNT(*) FROM `entry`'}, {'time': '0.011', 'sql': 'SELECT `entry`.`id`, <snip> FROM `entry` LIMIT 10 OFFSET 30'}]
Finally, the entries have been fetched.
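The sequence traced above can be imitated without a database. The following is a minimal sketch, not Django's implementation: FakeQuerySet and paginate are hypothetical names, and the "SQL" strings are only log entries. It illustrates why building a page costs one COUNT, and why the rows are fetched only when the page is iterated.

```python
class FakeQuerySet:
    """Hypothetical stand-in for a lazy queryset; logs the 'queries' it would run."""

    def __init__(self, rows, log, start=0, stop=None):
        self._rows = rows
        self._log = log
        self._start = start
        self._stop = len(rows) if stop is None else stop

    def count(self):
        # Counting is cheap and does not fetch any rows.
        self._log.append("SELECT COUNT(*)")
        return self._stop - self._start

    def __getitem__(self, key):
        # Slicing only narrows the window; nothing is fetched yet.
        if not isinstance(key, slice):
            raise TypeError("this sketch supports slices only")
        start = self._start + (key.start or 0)
        stop = self._stop if key.stop is None else min(self._start + key.stop, self._stop)
        return FakeQuerySet(self._rows, self._log, start, stop)

    def __iter__(self):
        # Iteration is the point where the 'query' actually runs.
        limit = self._stop - self._start
        self._log.append(f"SELECT ... LIMIT {limit} OFFSET {self._start}")
        return iter(self._rows[self._start:self._stop])


def paginate(queryset, per_page, number):
    """Return the lazy slice for the 1-based page `number`."""
    total = queryset.count()
    bottom = (number - 1) * per_page
    top = min(bottom + per_page, total)
    return queryset[bottom:top]


log = []
entries = FakeQuerySet(list(range(100)), log)
page = paginate(entries, 10, 4)
queries_after_paging = list(log)   # only the COUNT has been logged so far
rows = list(page)                  # iterating the page triggers the fetch
```

After the last line, log holds the COUNT entry followed by a single LIMIT 10 OFFSET 30 entry, matching the order of queries in the shell session.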

It is. Django's pagination uses the same rules and optimizations that apply to querysets.
This means paginator.page(page) only issues a COUNT query, and the entries themselves are fetched at return HttpResponse(output).

Related

Performance of django API view pagination

I am new to Python and Django. I am using APIView and looking at pagination code. I have looked through many examples, and I have the same concern about all of them.
They all seem to get all the data from the table and then paginate that data.
zones = Zone.objects.all()
paginator = Paginator(zones, 2)
page = 2
zones = paginator.page(page)
serializer = ZoneSerializer(zones, many=True)
return {"data": serializer.data, 'count': zones.paginator.count, "code": status.HTTP_200_OK, "message": 'OK'}
My expectation is that I should not have to fetch all the records and then paginate them with the paginator. Otherwise I will have to write my own code to handle it.
It is not true that it gets all the records from the database.
Look at this (using the Django shell). Note the LIMIT:
from django.db import connection
from apps.question.models import Question
from django.core.paginator import Paginator
p = Paginator(Question.objects.all(),2)
print(connection.queries)
[]
p.page(1)[0] # Accessing one element in page
print(connection.queries)
[{'sql': 'SELECT COUNT(*) AS "__count" FROM "question"',
'time': '0.001'},
{'sql': 'SELECT <all fields> FROM "question" ORDER BY "question"."id" DESC LIMIT 2',
'time': '0.000'},
]
Note: I removed the list of all fields from the 2nd query so it fits nicely here.
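The LIMIT in that query follows directly from the page number and page size. As a rough sketch of the arithmetic (page_bounds is a hypothetical helper, not part of Django, and it ignores the paginator's orphans option):

```python
def page_bounds(number, per_page, count):
    """Return (offset, limit) for a 1-based page number over `count` rows."""
    if number < 1:
        raise ValueError("page numbers start at 1")
    offset = (number - 1) * per_page
    if offset >= count:
        raise ValueError("page is past the end of the result set")
    # The last page may be shorter than per_page.
    limit = min(per_page, count - offset)
    return offset, limit


first_page = page_bounds(1, 2, 5)   # offset 0, so the SQL needs no OFFSET clause
last_page = page_bounds(3, 2, 5)    # a short final page
```

For the first page the offset is 0, which is why the query above contains only LIMIT 2 and no OFFSET.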

Django : random ordering(order_by('?')) makes additional query

Here is some sample code in Django.
[Case 1]
views.py
from django.core.cache import cache
from django.shortcuts import render_to_response
from django.template import RequestContext
from sampleapp.models import SampleModel

def get_filtered_data(request):
    result = cache.get("result")
    # populate the cache if no result is stored yet
    if not result:
        result = SampleModel.objects.filter(field_A="foo")
        cache.set("result", result)
    return render_to_response('template.html', locals(), context_instance=RequestContext(request))
template.html
{% for case in result %}
<p>{{ case.field_A}}</p>
{% endfor %}
In this case, no query is generated once the cache has been populated. I checked with django_debug_toolbar.
[Case 2]
views.py - added one line result = result.order_by('?')
from django.core.cache import cache
from django.shortcuts import render_to_response
from django.template import RequestContext
from sampleapp.models import SampleModel

def get_filtered_data(request):
    result = cache.get("result")
    # populate the cache if no result is stored yet
    if not result:
        result = SampleModel.objects.filter(field_A="foo")
        cache.set("result", result)
    result = result.order_by('?')
    return render_to_response('template.html', locals(), context_instance=RequestContext(request))
template.html - same as previous one
In this case, a new query is generated even though I cached the filtered queryset.
How can I apply random ordering without an additional query?
I can't put order_by('?') into the query when populating the cache
(e.g. result = SampleModel.objects.filter(field_A="foo").order_by('?')),
because that would cache the random order as well.
Is this related to Django querysets being lazy?
Thanks in advance.
.order_by performs sorting at database level.
Here is an example. We store a lazy queryset in the variable results. No query has been made yet:
results = SampleModel.objects.filter(field_A="foo")
Touch the results, for example by iterating over them:
for r in results:  # here the query is sent to the database
    ...
If we do it again, no query is sent to the database, because the queryset has already cached its results:
for r in results:  # no query is sent to the database
    ...
But when you apply .order_by, the query is different, so Django has to send a new request to the database:
for r in results.order_by('?'):  # a new query is sent to the database
    ...
Solution
When you run a query in Django and you know that you will fetch all of its elements (i.e., no OFFSET and LIMIT), you can process those elements in Python after you get them from the database.
results = list(SampleModel.objects.filter(field_A="foo")) # convert here queryset to list
That line executes the query, and results now holds all the elements.
If you need a random order, produce it in Python now:
from random import shuffle
shuffle(results)
After that, results is in random order, and no additional query has been sent to the database.
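Note that shuffle reorders the list in place. If the same list object is reused elsewhere, a small variation is to shuffle a copy instead. This is only a sketch: shuffled_copy is a hypothetical helper, and the plain list of strings stands in for the model instances fetched above.

```python
import random


def shuffled_copy(items, rng=random):
    """Return a new list with the items in random order; the input is left untouched."""
    result = list(items)
    rng.shuffle(result)
    return result


cached = ["a", "b", "c", "d", "e"]   # stands in for the already-fetched rows
view_result = shuffled_copy(cached)  # random order, no database involved
```

Passing a seeded random.Random instance as rng makes the order reproducible, which can be handy in tests.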

In Django filter statement what's the difference between __exact and equal sign (=)?

In Django filter statement what's the difference if I write:
.filter(name__exact='Alex')
and
.filter(name='Alex')
Thanks
There is no difference, the second one implies using the __exact.
From the documentation:
For example, the following two statements are equivalent:
>>> Blog.objects.get(id__exact=14) # Explicit form
>>> Blog.objects.get(id=14)         # __exact is implied
This is for convenience, because exact lookups are the common case.
You can look at the SQL that Django will execute by converting the queryset's query property to a string:
>>> from django.contrib.auth.models import User
>>> str(User.objects.filter(username = 'name').query)
'SELECT ... WHERE `auth_user`.`username` = name '
>>> str(User.objects.filter(username__exact = 'name').query)
'SELECT ... WHERE `auth_user`.`username` = name '
So __exact makes no difference here.
This is not exactly what the question asks, but it may be useful for some developers.
The result depends on the database and its collation. I am using MySQL with a case-insensitive (ci) collation and got a strange result.
If there is a User "Test" and the query value has a trailing space:
In : User.objects.filter(username__iexact="Test ")
Out : <QuerySet []>
In : User.objects.filter(username__exact="Test ")
Out : <QuerySet [<User: Test>]>
In : User.objects.filter(username="Test ")
Out : <QuerySet [<User: Test>]>
from django.test import TestCase
from user.factories import UserFactory
from user.models import User

class TestCaseSensitiveQueryTestCase(TestCase):
    def setUp(self):
        super().setUp()
        self.QUERY = 'case sensitive username'
        self.USERNAME = 'cAse SEnSItIVE UsErNAME'
        self.user = UserFactory(name=self.USERNAME)

    def test_implicit_exact_match(self):
        with self.assertRaises(User.DoesNotExist):
            User.objects.get(name=self.QUERY)

    def test_explicit_iexact_match(self):
        User.objects.get(name__iexact=self.QUERY)

Django Order By Date, but have "None" at end?

I have a model of work orders, with a field for when the work order is required by. To get a list of work orders, with those that are required earliest first, I do this:
wo = Work_Order.objects.order_by('dateWORequired')
This works nicely, but ONLY if there is actually a value in that field. If there is no required date, then the value is None. Then, the list of work orders has all the None's at the top, and then the remaining work orders in proper order.
How can I get the None's at the bottom?
Django 1.11 added this as a native feature. It's a little convoluted. It is documented.
Ordered with only one field, ascending:
from django.db.models import F
wo = Work_Order.objects.order_by(F('dateWORequired').asc(nulls_last=True))
Ordered using two fields, both descending:
wo = Work_Order.objects.order_by(F('dateWORequired').desc(nulls_last=True), F('anotherfield').desc(nulls_last=True))
q = q.extra(
    select={'date_is_null': 'dateWORequired IS NULL'},
    order_by=['date_is_null', 'dateWORequired'],
)
You might need a - before the date_is_null in the order_by portion, but that's how you can control the behavior.
This was not available when the question was asked, but since Django 1.8 I think this is the best solution:
import datetime
from django.db.models import Value
from django.db.models.functions import Coalesce
long_ago = datetime.datetime(year=1980, month=1, day=1)
Work_Order.objects.annotate(
    date_null=Coalesce('dateWORequired', Value(long_ago))
).order_by('date_null')
Coalesce selects the first non-null value, so this creates an annotation date_null to order by, which is just dateWORequired with null replaced by a date long ago. In ascending order those rows sort first; to push them to the end instead, substitute a date far in the future.
Requirements:
Python 3.4, Django 1.10.2, PostgreSQL 9.5.4
Variant 1
Solution:
class IsNull(models.Func):
    template = "%(expressions)s IS NULL"
Usage (None always last):
In [1]: a = User.polls_manager.users_as_voters()
In [4]: from django.db import models
In [5]: class IsNull(models.Func):
   ...:     template = "%(expressions)s IS NULL"
   ...:
In [7]: a = a.annotate(date_latest_voting_isnull=IsNull('date_latest_voting'))
In [9]: for i in a.order_by('date_latest_voting_isnull', 'date_latest_voting'):
   ...:     print(i.date_latest_voting)
   ...:
2016-07-30 01:48:11.872911+00:00
2016-08-31 13:13:47.240085+00:00
2016-09-16 00:04:23.042142+00:00
2016-09-18 19:45:54.958573+00:00
2016-09-26 07:27:34.301295+00:00
2016-10-03 14:01:08.377417+00:00
2016-10-21 16:07:42.881526+00:00
2016-10-23 11:10:02.342791+00:00
2016-10-31 04:09:03.726765+00:00
None
In [10]: for i in a.order_by('date_latest_voting_isnull', '-date_latest_voting'):
    ...:     print(i.date_latest_voting)
    ...:
2016-10-31 04:09:03.726765+00:00
2016-10-23 11:10:02.342791+00:00
2016-10-21 16:07:42.881526+00:00
2016-10-03 14:01:08.377417+00:00
2016-09-26 07:27:34.301295+00:00
2016-09-18 19:45:54.958573+00:00
2016-09-16 00:04:23.042142+00:00
2016-08-31 13:13:47.240085+00:00
2016-07-30 01:48:11.872911+00:00
None
Notes
Based on https://www.isotoma.com/blog/2015/11/23/sorting-querysets-with-nulls-in-django/
Drawback: unnecessary buffer field, overhead for ordering
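The idea behind Variant 1, ordering first by an "is null" flag and then by the value itself, can be tried in plain Python. This is only an analogy under stated assumptions: sort_nulls_last is a hypothetical helper, and the dates are made-up sample values rather than the data shown above.

```python
import datetime


def sort_nulls_last(values, descending=False):
    """Sort values, keeping None entries at the end in either direction."""
    if descending:
        # With reverse=True, keys starting with True (non-null) come first.
        return sorted(values, key=lambda v: (v is not None, v), reverse=True)
    # Keys starting with False (non-null) come first; None keys compare equal.
    return sorted(values, key=lambda v: (v is None, v))


d1 = datetime.date(2016, 7, 30)
d2 = datetime.date(2016, 8, 31)
d3 = datetime.date(2016, 10, 31)
sample = [d2, None, d3, d1, None]

ascending = sort_nulls_last(sample)
descending = sort_nulls_last(sample, descending=True)
```

The first element of each key plays the role of the date_latest_voting_isnull annotation: it decides where the None group goes before the real values are compared.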
Variant 2
Solution:
from django.db import models
from django.db import connections
from django.db.models.sql.compiler import SQLCompiler

class NullsLastCompiler(SQLCompiler):
    # source code https://github.com/django/django/blob/master/django/db/models/sql/compiler.py
    def get_order_by(self):
        result = super(NullsLastCompiler, self).get_order_by()
        # if there is an ordering and the backend is PostgreSQL
        if result and self.connection.vendor == 'postgresql':
            # append NULLS LAST to each ORDER BY term in the raw SQL
            # more info https://www.postgresql.org/docs/9.5/static/queries-order.html
            result = [
                (expression, (sql + ' NULLS LAST', params, is_ref))
                for expression, (sql, params, is_ref) in result
            ]
        return result

class NullsLastQuery(models.sql.Query):
    # source code https://github.com/django/django/blob/master/django/db/models/sql/query.py
    def get_compiler(self, using=None, connection=None):
        if using is None and connection is None:
            raise ValueError("Need either using or connection")
        if using:
            connection = connections[using]
        # return our own compiler
        return NullsLastCompiler(self, connection, using)

class NullsLastQuerySet(models.QuerySet):
    # source code https://github.com/django/django/blob/master/django/db/models/query.py
    def __init__(self, model=None, query=None, using=None, hints=None):
        super(NullsLastQuerySet, self).__init__(model, query, using, hints)
        # replace the default Query with our own
        self.query = query or NullsLastQuery(model)
Usage:
# instead of models.QuerySet use NullsLastQuerySet
class UserQuestionQuerySet(NullsLastQuerySet):
    def users_with_date_latest_question(self):
        return self.annotate(date_latest_question=models.Max('questions__created'))

# connect to a model as a manager
class User(AbstractBaseUser, PermissionsMixin):
    .....
    questions_manager = UserQuestionQuerySet().as_manager()
Results (None always last):
In [2]: qs = User.questions_manager.users_with_date_latest_question()
In [3]: for i in qs:
   ...:     print(i.date_latest_question)
   ...:
None
None
None
2016-10-28 20:48:49.005593+00:00
2016-10-04 19:01:38.820993+00:00
2016-09-26 00:35:07.839646+00:00
None
2016-07-27 04:33:58.508083+00:00
2016-09-14 10:40:44.660677+00:00
None
In [4]: for i in qs.order_by('date_latest_question'):
   ...:     print(i.date_latest_question)
   ...:
2016-07-27 04:33:58.508083+00:00
2016-09-14 10:40:44.660677+00:00
2016-09-26 00:35:07.839646+00:00
2016-10-04 19:01:38.820993+00:00
2016-10-28 20:48:49.005593+00:00
None
None
None
None
None
In [5]: for i in qs.order_by('-date_latest_question'):
   ...:     print(i.date_latest_question)
   ...:
2016-10-28 20:48:49.005593+00:00
2016-10-04 19:01:38.820993+00:00
2016-09-26 00:35:07.839646+00:00
2016-09-14 10:40:44.660677+00:00
2016-07-27 04:33:58.508083+00:00
None
None
None
None
None
Notes:
Based on "Django: Adding "NULLS LAST" to query" and Django's source code
Applies globally to all fields of a model (both an advantage and a disadvantage)
No unnecessary extra field
Drawback: tested only on PostgreSQL
I endeavoured to get this working with pure Django, without dropping into SQL.
The F() expression function can be used with order_by, so I tried to concoct a way of creating an expression which sets all numbers to the same value, but which sets all NULLs to another specific value.
MySQL will order NULLs before 0s in ascending order, and vice versa in descending order.
So this works:
order_by( (0 * F('field')).asc() ) # Nulls first
# or:
order_by( (0 * F('field')).desc() ) # Nulls last
You can then pass any other fields to the same order_by call, before or after that expression.
I've tried it with dates and the same happens. e.g.:
SELECT 0*CURRENT_TIMESTAMP;
Evaluates to 0.

Flexible pagination in Django

I'd like to implement pagination such that I can allow the user to choose the number of records per page such as 10, 25, 50 etc. How should I go about this? Is there an app I can add onto my project to do this?
Thanks
Django has a Paginator object built into core. It's a rather straightforward API to use. Instantiate a Paginator class with two arguments: the list and the number of entries per "page". I'll paste some sample code at the bottom.
In your case you want to allow the user to choose the per-page count. You could either make the per-page count part of the URL (ie. your/page/10/) or you could make it a query string (ie. your/page/?p=10).
Something like...
# Assuming you're reading the query string value ?p=
try:
    per_page = int(request.GET['p'])
except (KeyError, ValueError):
    per_page = 25  # default value
paginator = Paginator(objects, per_page)
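Because the per-page value comes from the user, it may be worth restricting it to a fixed set of choices so that nobody can request an enormous page. A minimal sketch, assuming the choices 10, 25 and 50 from the question (parse_per_page and the two constants are hypothetical names, not Django API):

```python
ALLOWED_PER_PAGE = (10, 25, 50)   # assumed choices, taken from the question
DEFAULT_PER_PAGE = 25


def parse_per_page(raw, allowed=ALLOWED_PER_PAGE, default=DEFAULT_PER_PAGE):
    """Turn a raw query-string value into a safe per-page count."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        # Missing (None) or non-numeric input falls back to the default.
        return default
    # Anything outside the allowed set also falls back to the default.
    return value if value in allowed else default
```

In a view this could be called as parse_per_page(request.GET.get('p')) and the result passed to Paginator.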
Here's some sample code from the Django doc page for the Paginator to better see how it works.
>>> from django.core.paginator import Paginator
>>> objects = ['john', 'paul', 'george', 'ringo']
>>> p = Paginator(objects, 2)
>>> p.count
4
>>> p.num_pages
2
>>> p.page_range
[1, 2]
>>> page1 = p.page(1)
>>> page1
<Page 1 of 2>
>>> page1.object_list
['john', 'paul']
Search for "django pagination", and make sure your SQL uses a covering index so that the query is efficient.
T. Stone's answer covers most of what I was going to say. I just want to add that you can use pagination in generic views. In particular, you may find django.views.generic.list_detail.object_list useful (function-based generic views like this one were removed in Django 1.5; the class-based ListView with paginate_by is the replacement).
You can write a small wrapper function that gets the number of objects to display per page from the request object and then calls object_list.
def paginated_object_list(request, page):
    my_queryset = MyModel.objects.all()
    # Here's T. Stone's code to get the number of items per page
    try:
        per_page = int(request.GET['p'])
    except (KeyError, ValueError):
        per_page = 25  # default value
    return object_list(request, queryset=my_queryset,
                       paginate_by=per_page, page=page)
Then, the context for your template will contain the variables,
paginator: An instance of django.core.paginator.Paginator.
page_obj: An instance of django.core.paginator.Page.
and you can loop through page_obj to display the objects for that page.
What is needed? Well.
You can add a custom control to change_list.html, for example in the pagination block.
On its onchange event, the control reloads the list page with a GET parameter, for example per_page, set to the chosen value.
In your ModelAdmin you must override the changelist_view method, read that GET parameter there, and set its value as list_per_page.
def changelist_view(self, request, extra_context=None):
    # use the requested page size only if it is one of the allowed values
    if request.GET.get('per_page') and int(
            request.GET.get('per_page')) in CHANGELIST_PERPAGE_LIMITS:
        self.list_per_page = int(request.GET.get('per_page'))
    else:
        self.list_per_page = 100
    extra_context = {'changelist_perpage_limits': CHANGELIST_PERPAGE_LIMITS,
                     'list_per_page': self.list_per_page}
    return super(mymodelAdmin, self).changelist_view(request, extra_context)
I use extra_context to make these values available in the template. Maybe there is a neater way to access them; I don't know :-)