I have a model of work orders, with a field for when the work order is required by. To get a list of work orders with the earliest required dates first, I do this:
wo = Work_Order.objects.order_by('dateWORequired')
This works nicely, but ONLY if there is actually a value in that field. If there is no required date, then the value is None. Then, the list of work orders has all the None's at the top, and then the remaining work orders in proper order.
How can I get the None's at the bottom?
Django 1.11 added this as a native feature. It's a little convoluted. It is documented.
Ordered with only one field, ascending:
from django.db.models import F
wo = Work_Order.objects.order_by(F('dateWORequired').asc(nulls_last=True))
Ordered using two fields, both descending:
wo = Work_Order.objects.order_by(F('dateWORequired').desc(nulls_last=True), F('anotherfield').desc(nulls_last=True))
q = q.extra(
    select={
        'date_is_null': 'dateWORequired IS NULL',
    },
    order_by=['date_is_null', 'dateWORequired'],
)
You might need a - before the date_is_null in the order_by portion, but that's how you can control the behavior.
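To see what the extra select column does to the ordering, here is a minimal sketch that runs the same idea as plain SQL against an in-memory SQLite database. The table and column names (work_order, date_required) are made up for the illustration, and this is not what Django itself generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE work_order (id INTEGER PRIMARY KEY, date_required TEXT)")
conn.executemany(
    "INSERT INTO work_order (id, date_required) VALUES (?, ?)",
    [(1, None), (2, "2024-03-01"), (3, "2024-01-15"), (4, None)],
)

# "date_required IS NULL" is 0 for real dates and 1 for NULLs,
# so sorting on it first pushes the NULL rows to the end.
ids_nulls_last = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM work_order "
        "ORDER BY date_required IS NULL, date_required, id"
    )
]

# Reversing the flag (the "-" prefix in order_by) puts the NULL rows first.
ids_nulls_first = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM work_order "
        "ORDER BY date_required IS NULL DESC, date_required, id"
    )
]

print(ids_nulls_last)
print(ids_nulls_first)
```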
This was not available when the question was asked, but since Django 1.8 I think this is the best solution:
import datetime
from django.db.models import Value
from django.db.models.functions import Coalesce
long_ago = datetime.datetime(year=1980, month=1, day=1)
# instead of: Work_Order.objects.order_by('dateWORequired')
Work_Order.objects.annotate(date_null=
    Coalesce('dateWORequired', Value(long_ago))).order_by('date_null')
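Coalesce selects the first non-null value, so you create a value date_null to order by which is just dateWORequired but with null replaced by a date long ago. In ascending order a date long ago sorts the nulls first; to push them to the bottom, substitute a date far in the future instead.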
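The effect of the placeholder date is easy to check with SQL's COALESCE directly. This is a rough sketch against in-memory SQLite, not Django; the table, column and the two sentinel values are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE work_order (id INTEGER PRIMARY KEY, date_required TEXT)")
conn.executemany(
    "INSERT INTO work_order (id, date_required) VALUES (?, ?)",
    [(1, None), (2, "2024-03-01"), (3, "2024-01-15"), (4, None)],
)

def ids_ordered_with(sentinel):
    """Order by the date, substituting `sentinel` wherever the date is NULL."""
    query = (
        "SELECT id FROM work_order "
        "ORDER BY COALESCE(date_required, ?), id"
    )
    return [row[0] for row in conn.execute(query, (sentinel,))]

# A sentinel in the past sorts the NULL rows first ...
with_long_ago = ids_ordered_with("1980-01-01")
# ... while a sentinel in the far future sorts them last.
with_far_future = ids_ordered_with("9999-12-31")

print(with_long_ago)
print(with_far_future)
```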
Requirements:
Python 3.4, Django 1.10.2, PostgreSQL 9.5.4
Variant 1
Solution:
class IsNull(models.Func):
    template = "%(expressions)s IS NULL"
Usage (None always last):
In [1]: a = User.polls_manager.users_as_voters()
In [4]: from django.db import models
In [5]: class IsNull(models.Func):
   ...:     template = "%(expressions)s IS NULL"
   ...:
In [7]: a = a.annotate(date_latest_voting_isnull=IsNull('date_latest_voting'))
In [9]: for i in a.order_by('date_latest_voting_isnull', 'date_latest_voting'):
   ...:     print(i.date_latest_voting)
   ...:
2016-07-30 01:48:11.872911+00:00
2016-08-31 13:13:47.240085+00:00
2016-09-16 00:04:23.042142+00:00
2016-09-18 19:45:54.958573+00:00
2016-09-26 07:27:34.301295+00:00
2016-10-03 14:01:08.377417+00:00
2016-10-21 16:07:42.881526+00:00
2016-10-23 11:10:02.342791+00:00
2016-10-31 04:09:03.726765+00:00
None
In [10]: for i in a.order_by('date_latest_voting_isnull', '-date_latest_voting'):
    ...:     print(i.date_latest_voting)
    ...:
2016-10-31 04:09:03.726765+00:00
2016-10-23 11:10:02.342791+00:00
2016-10-21 16:07:42.881526+00:00
2016-10-03 14:01:08.377417+00:00
2016-09-26 07:27:34.301295+00:00
2016-09-18 19:45:54.958573+00:00
2016-09-16 00:04:23.042142+00:00
2016-08-31 13:13:47.240085+00:00
2016-07-30 01:48:11.872911+00:00
None
Notes
Based on https://www.isotoma.com/blog/2015/11/23/sorting-querysets-with-nulls-in-django/
Drawback: unnecessary buffer field, overhead for ordering
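The two-column ordering used in Variant 1 (first "is it null?", then the value) is the same trick as a tuple sort key in plain Python. A small sketch with made-up dates, only to show why the None rows stay last in both directions:

```python
from datetime import date

voting_dates = [date(2016, 10, 3), None, date(2016, 7, 30), date(2016, 10, 31)]

# First key element: False for real dates, True for None, so None sorts last.
# Second key element: the date itself, only compared among real dates.
ascending = sorted(voting_dates, key=lambda d: (d is None, d))

# For descending dates the "is None" flag stays ascending, and only the
# date part is reversed (here by negating its ordinal number).
descending = sorted(
    voting_dates,
    key=lambda d: (d is None, -d.toordinal() if d is not None else 0),
)

print(ascending)
print(descending)
```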
Variant 2
Solution:
from django.db import models
from django.db import connections
from django.db.models.sql.compiler import SQLCompiler
class NullsLastCompiler(SQLCompiler):
    # source code https://github.com/django/django/blob/master/django/db/models/sql/compiler.py
    def get_order_by(self):
        result = super(NullsLastCompiler, self).get_order_by()
        # only if there is an ORDER BY and the backend is PostgreSQL
        if result and self.connection.vendor == 'postgresql':
            # append NULLS LAST to the raw SQL of every ORDER BY term
            # more info https://www.postgresql.org/docs/9.5/static/queries-order.html
            result = [
                (expression, (sql + ' NULLS LAST', params, is_ref))
                for expression, (sql, params, is_ref) in result
            ]
        return result

class NullsLastQuery(models.sql.Query):
    # source code https://github.com/django/django/blob/master/django/db/models/sql/query.py
    def get_compiler(self, using=None, connection=None):
        if using is None and connection is None:
            raise ValueError("Need either using or connection")
        if using:
            connection = connections[using]
        # return our own compiler
        return NullsLastCompiler(self, connection, using)

class NullsLastQuerySet(models.QuerySet):
    # source code https://github.com/django/django/blob/master/django/db/models/query.py
    def __init__(self, model=None, query=None, using=None, hints=None):
        super(NullsLastQuerySet, self).__init__(model, query, using, hints)
        # replace the default Query with our own
        self.query = query or NullsLastQuery(model)
Usage:
# instead of models.QuerySet use NullsLastQuerySet
class UserQuestionQuerySet(NullsLastQuerySet):
    def users_with_date_latest_question(self):
        return self.annotate(date_latest_question=models.Max('questions__created'))

# connect to a model as a manager
class User(AbstractBaseUser, PermissionsMixin):
    .....
    questions_manager = UserQuestionQuerySet().as_manager()
Results (None always last):
In [2]: qs = User.questions_manager.users_with_date_latest_question()
In [3]: for i in qs:
   ...:     print(i.date_latest_question)
   ...:
None
None
None
2016-10-28 20:48:49.005593+00:00
2016-10-04 19:01:38.820993+00:00
2016-09-26 00:35:07.839646+00:00
None
2016-07-27 04:33:58.508083+00:00
2016-09-14 10:40:44.660677+00:00
None
In [4]: for i in qs.order_by('date_latest_question'):
   ...:     print(i.date_latest_question)
   ...:
2016-07-27 04:33:58.508083+00:00
2016-09-14 10:40:44.660677+00:00
2016-09-26 00:35:07.839646+00:00
2016-10-04 19:01:38.820993+00:00
2016-10-28 20:48:49.005593+00:00
None
None
None
None
None
In [5]: for i in qs.order_by('-date_latest_question'):
   ...:     print(i.date_latest_question)
   ...:
2016-10-28 20:48:49.005593+00:00
2016-10-04 19:01:38.820993+00:00
2016-09-26 00:35:07.839646+00:00
2016-09-14 10:40:44.660677+00:00
2016-07-27 04:33:58.508083+00:00
None
None
None
None
None
Notes:
Based on the question 'Django: Adding "NULLS LAST" to query' and on Django's source code
Applies globally to all fields of a model (which is both an advantage and a disadvantage)
No unnecessary extra field
Drawback: tested only on PostgreSQL
I endeavoured to get this working with pure Django, without dropping into SQL.
The F() expression function can be used with order_by, so I tried to concoct a way of creating an expression which sets all numbers to the same value, but which sets all NULLs to another specific value.
MySQL will order NULLs before 0s in ascending order, and vice versa in descending order.
So this works:
order_by( (0 * F('field')).asc() ) # Nulls first
# or:
order_by( (0 * F('field')).desc() ) # Nulls last
You can then pass any other fields to the same order_by call, before or after that expression.
I've tried it with dates and the same happens. e.g.:
SELECT 0*CURRENT_TIMESTAMP;
Evaluates to 0.
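The multiplication trick rests on two facts: 0 times any number is 0, and 0 times NULL is still NULL. Below is a quick sketch using in-memory SQLite, which (like MySQL) treats NULL as smaller than any value when sorting. The table name and data are invented, and databases that sort NULLs as the largest value by default, such as PostgreSQL, would give the opposite placement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, field INTEGER)")
conn.executemany(
    "INSERT INTO t (id, field) VALUES (?, ?)",
    [(1, None), (2, 7), (3, 2), (4, None)],
)

# 0 * NULL stays NULL, 0 * 5 collapses to 0.
null_product, zero_product = conn.execute("SELECT 0 * NULL, 0 * 5").fetchone()

# Descending on the collapsed value: all 0s first, NULLs last;
# the real field then orders the non-NULL rows among themselves.
ids_nulls_last = [
    row[0]
    for row in conn.execute("SELECT id FROM t ORDER BY 0 * field DESC, field, id")
]

# Ascending on the collapsed value: NULLs first.
ids_nulls_first = [
    row[0]
    for row in conn.execute("SELECT id FROM t ORDER BY 0 * field ASC, field, id")
]

print(null_product, zero_product, ids_nulls_last, ids_nulls_first)
```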
Similar to an exclude filter that looks like this:
MyObject.objects.exclude(my_field="")
I know that I can modify the view to exclude the value in the queryset but that's not dynamic:
def get_queryset(self):
    return Client.objects.exclude(my_field="")
But I'm looking for a less static way of querying via URL. So something like this:
/api/object?my_field__isempty=False
Hashem's answer is good, but you could also use Q statements:
from django.db.models import Q
# Q allows for complex queries with:
# & (AND)
# | (OR)
# ^ (XOR)
# Exclude method
ModelName.objects.exclude(Q(field_name__isnull=True) | Q(field_name__exact=''))
# Filter Method with ~ (NOT)
ModelName.objects.filter(~Q(field_name__isnull=True) & ~Q(field_name__exact=''))
Just throwing out another option, and knowing about Q statements is really beneficial
Docs: https://docs.djangoproject.com/en/4.1/topics/db/queries/#complex-lookups-with-q
Edit
Here's an extra tidbit you'll find handy: you can unpack a dictionary into keyword arguments and filter by it. The dictionary can be created dynamically.
filterDict = {'my_field':''}
Client.objects.exclude(**filterDict)
# Client.objects.exclude(**filterDict) is the same as Client.objects.exclude(my_field='')
I'm not sure how you are doing views, but if you have a "normal" one with the request object, you can fetch the GET parameters as a QueryDict:
def myview_with_dynamic_pointer(request):
    print(request.GET)
    # Should work
    Client.objects.filter(**request.GET)
    # will work
    Client.objects.filter(**{k: v for k, v in request.GET.items()})

def myview_with_dynamic_Q(request):
    print(request.GET)
    from django.db.models import Q
    dynamicquery = Q()
    for key, value in request.GET.items():
        # Q takes keyword arguments, so unpack a one-item dict
        dynamicquery = dynamicquery & Q(**{key: value})
        # Can also do OR:
        # dynamicquery = dynamicquery | Q(**{key: value})
    Client.objects.filter(dynamicquery)
If you are using class based views, it'll be more like:
class MyListView(ListView):
    model = MyModel

    def get_queryset(self):
        print(self.request.GET)
        return self.model.objects.filter(**self.request.GET)
For safety, it is a good idea to check the GET parameters before passing them straight into a filter.
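One way to vet the parameters is to keep only lookups from an explicit whitelist before building the filter kwargs. The sketch below uses only the standard library so it runs on its own; the function name, the whitelist contents and the boolean parsing are all assumptions for illustration, and in a view you would pass the resulting dict to something like Client.objects.filter(**kwargs).

```python
from urllib.parse import parse_qs

# Hypothetical whitelist: only these lookups may come from the URL.
ALLOWED_LOOKUPS = frozenset({"my_field", "my_field__isnull", "name__icontains"})

def build_filter_kwargs(query_string, allowed=ALLOWED_LOOKUPS):
    """Turn a raw query string into filter kwargs, dropping unknown keys."""
    params = parse_qs(query_string, keep_blank_values=True)
    kwargs = {}
    for key, values in params.items():
        if key not in allowed:
            continue  # silently ignore anything not whitelisted
        value = values[-1]  # keep the last value if a key is repeated
        if key.endswith("__isnull"):
            # URL values are strings; isnull needs a real boolean
            value = value.lower() in ("1", "true", "yes")
        kwargs[key] = value
    return kwargs

kwargs = build_filter_kwargs(
    "my_field__isnull=False&password__startswith=a&my_field="
)
print(kwargs)
```

The password__startswith parameter is dropped because it is not in the whitelist, which is the kind of probing a raw **request.GET would otherwise allow.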
If I understood you correctly, you need to exclude both null values and empty strings:
ModelName.objects.exclude(field_name__isnull=True).exclude(field_name__exact='')
Or you can use just one of them. Together they are equivalent to:
NOT field_name='' AND field_name IS NOT NULL
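To check which rows survive that condition, here is a small sketch that runs the SQL above as written against an in-memory SQLite table. The client table and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE client (id INTEGER PRIMARY KEY, my_field TEXT)")
conn.executemany(
    "INSERT INTO client (id, my_field) VALUES (?, ?)",
    [(1, "abc"), (2, ""), (3, None), (4, "xyz")],
)

# Row 2 (empty string) fails the first condition,
# row 3 (NULL) fails the second one.
kept_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM client "
        "WHERE NOT my_field = '' AND my_field IS NOT NULL "
        "ORDER BY id"
    )
]
print(kept_ids)
```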
I use django-import-export 2.8.0 with Oracle 12c.
Line-by-line import via import_data() works without problems, but when I turn on the use_bulk=True option, it stops importing and does not throw any errors.
Why doesn't it work?
resources.py
class ClientsResources(resources.ModelResource):
    class Meta:
        model = Clients
        fields = ('id', 'name', 'surname', 'age', 'is_active')
        batch_size = 1000
        use_bulk = True
        raise_errors = True
views.py
def import_data(request):
    if request.method == 'POST':
        file_format = request.POST['file-format']
        new_employees = request.FILES['importData']
        clients_resource = ClientsResources()
        dataset = Dataset()
        imported_data = dataset.load(new_employees.read().decode('utf-8'), format=file_format)
        result = clients_resource.import_data(imported_data, dry_run=True, raise_errors=True)
        if not result.has_errors():
            clients_resource.import_data(imported_data, dry_run=False)
    return HttpResponseRedirect(request.META.get('HTTP_REFERER'))
data.csv
id,name,surname,age,is_active
18,XSXQAMA,BEHKZFI,89,Y
19,DYKNLVE,ZVYDVCX,20,Y
20,GPYXUQE,BCSRUSA,73,Y
21,EFHOGJJ,MXTWVST,93,Y
22,OGRCEEQ,KJZVQEG,52,Y
--UPD--
I used django-debug-toolbar and saw very strange behavior in the import queries.
It doesn't work through the admin panel either. I see all the rows to be imported, and then it reports "Import finished, with 5 new and 0 updated clients.", but I see these strange queries.
Then I import through my own form, and the situation is the same:
use_bulk by django-import-export (more)
And for comparison, my manual create_bulk()
--UPD2--
I tried to trace the import logic, and look what I found:
import_export/resources.py
def bulk_create(self, using_transactions, dry_run, raise_errors, batch_size=None):
    """
    Creates objects by calling ``bulk_create``.
    """
    print(self.create_instances)
    try:
        if len(self.create_instances) > 0:
            if not using_transactions and dry_run:
                pass
            else:
                self._meta.model.objects.bulk_create(self.create_instances, batch_size=batch_size)
    except Exception as e:
        logger.exception(e)
        if raise_errors:
            raise e
    finally:
        self.create_instances.clear()
This print() showed an empty list.
This issue appears to be due to a bug in the 2.x version of django-import-export. It is fixed in v3.
The bug is present when running in bulk mode (use_bulk=True)
The logic in save_instance() finds that 'new' instances have pk values set, and then incorrectly treats them as updates, not creates.
I cannot determine how this would happen. It's possible this is related to using Oracle (though I cannot see how).
I try to use task queues on Google App Engine. I want to utilize the Mapper class shown in the App Engine documentation "Background work with the deferred library".
I get an exception on the ordering of the query result by the key
def get_query(self):
    ...
    q = q.order("__key__")
    ...
Exception:
File "C:... mapper.py", line 41, in get_query
q = q.order("__key__")
File "C:\Program Files (x86)\Google\google_appengine\google\appengine\ext\ndb\query.py", line 1124, in order
'received %r' % arg)
TypeError: order() expects a Property or query Order; received '__key__'
INFO 2017-03-09 11:56:32,448 module.py:806] default: "POST /_ah/queue/deferred HTTP/1.1" 500 114
The article is from 2009, so I guess something might have changed.
My environment: Windows 7, Python 2.7.9, Google App Engine SDK 1.9.50
There are somewhat similar questions about ordering in NDB on SO.
What bugs me is that this code is from the official doc, presumably updated in Feb 2017 (recently), and posted by someone within the top 0.1 % of SO users by reputation.
So I must be doing something wrong. What is the solution?
Bingo.
Avinash Raj is correct. If it were an answer I'd accept it.
Here is the full class code
#!/usr/bin/python2.7
# -*- coding: utf-8 -*-
from google.appengine.ext import deferred
from google.appengine.ext import ndb
from google.appengine.runtime import DeadlineExceededError
import logging
class Mapper(object):
    """
    from https://cloud.google.com/appengine/docs/standard/python/ndb/queries
    corrected with suggestions from Stack Overflow
    http://stackoverflow.com/questions/42692319/how-to-order-ndb-query-by-the-key
    """
    # Subclasses should replace this with a model class (eg, model.Person).
    KIND = None
    # Subclasses can replace this with a list of (property, value) tuples to filter by.
    FILTERS = []

    def __init__(self):
        logging.info("Mapper.__init__: {}")
        self.to_put = []
        self.to_delete = []

    def map(self, entity):
        """Updates a single entity.
        Implementers should return a tuple containing two iterables (to_update, to_delete).
        """
        return ([], [])

    def finish(self):
        """Called when the mapper has finished, to allow for any final work to be done."""
        pass

    def get_query(self):
        """Returns a query over the specified kind, with any appropriate filters applied."""
        q = self.KIND.query()
        for prop, value in self.FILTERS:
            q = q.filter(prop == value)
        # the fixed version. The original q.order('__key__') failed
        # see http://stackoverflow.com/questions/42692319/how-to-order-ndb-query-by-the-key
        q = q.order(self.KIND.key)
        return q

    def run(self, batch_size=100):
        """Starts the mapper running."""
        logging.info("Mapper.run: batch_size: {}".format(batch_size))
        self._continue(None, batch_size)

    def _batch_write(self):
        """Writes updates and deletes entities in a batch."""
        if self.to_put:
            ndb.put_multi(self.to_put)
            self.to_put = []
        if self.to_delete:
            ndb.delete_multi(self.to_delete)
            self.to_delete = []

    def _continue(self, start_key, batch_size):
        q = self.get_query()
        # If we're resuming, pick up where we left off last time.
        if start_key:
            key_prop = getattr(self.KIND, '_key')
            q = q.filter(key_prop > start_key)
        # Keep updating records until we run out of time.
        try:
            # Steps over the results, returning each entity and its index.
            for i, entity in enumerate(q):
                map_updates, map_deletes = self.map(entity)
                self.to_put.extend(map_updates)
                self.to_delete.extend(map_deletes)
                # Do updates and deletes in batches.
                if (i + 1) % batch_size == 0:
                    self._batch_write()
                # Record the last entity we processed.
                start_key = entity.key
            self._batch_write()
        except DeadlineExceededError:
            # Write any unfinished updates to the datastore.
            self._batch_write()
            # Queue a new task to pick up where we left off.
            deferred.defer(self._continue, start_key, batch_size)
            return
        self.finish()
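The reason the query is ordered by key is that _continue resumes with "key > start_key", which only picks up exactly the unprocessed entities when they are visited in key order. Here is a plain-Python sketch of that resume logic with no App Engine involved; OutOfTime, run_once and the budget parameter are hypothetical stand-ins for DeadlineExceededError, _continue and the request deadline.

```python
class OutOfTime(Exception):
    """Stand-in for DeadlineExceededError in this sketch."""

def run_once(entities, start_key, budget, processed):
    """Process entities with key > start_key in key order.

    Returns the key to resume from, or None when everything is done.
    """
    pending = sorted(k for k in entities if start_key is None or k > start_key)
    try:
        for count, key in enumerate(pending):
            if count >= budget:
                raise OutOfTime()  # simulate hitting the deadline
            processed.append(key)
            start_key = key  # remember the last key we handled
    except OutOfTime:
        return start_key
    return None

entities = {5: "e", 1: "a", 3: "c", 2: "b", 4: "d"}
processed = []
resume_from = None
runs = 0
while True:
    runs += 1
    resume_from = run_once(entities, resume_from, budget=2, processed=processed)
    if resume_from is None:
        break

print(runs, processed)
```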
I read here that Django querysets are lazy: a queryset won't be evaluated until it is actually used, for example printed. I have made a simple pagination using Django's built-in pagination. I didn't realize there were already apps such as "django-pagination" and "django-endless" that do that job.
Anyway I wonder whether the QuerySet is still lazy when I for example do this
entries = Entry.objects.filter(...)
paginator = Paginator(entries, 10)
output = paginator.page(page)
return HttpResponse(output)
And this part is called every time I want to get whichever page I currently want to view.
I need to know since I don't want unnecessary load to the database.
If you want to see where queries are occurring, import django.db.connection and inspect queries:
>>> from django.db import connection
>>> from django.core.paginator import Paginator
>>> queryset = Entry.objects.all()
Let's create the paginator, and see if any queries occur:
>>> paginator = Paginator(queryset, 10)
>>> print connection.queries
[]
None yet.
>>> page = paginator.page(4)
>>> page
<Page 4 of 788>
>>> print connection.queries
[{'time': '0.014', 'sql': 'SELECT COUNT(*) FROM `entry`'}]
Creating the page has produced one query, to count how many entries are in the queryset. The entries have not been fetched yet.
Assign the page's objects to the variable 'objects':
>>> objects = page.object_list
>>> print connection.queries
[{'time': '0.014', 'sql': 'SELECT COUNT(*) FROM `entry`'}]
This still hasn't caused the entries to be fetched.
Generate the HttpResponse from the object list
>>> response = HttpResponse(page.object_list)
>>> print connection.queries
[{'time': '0.014', 'sql': 'SELECT COUNT(*) FROM `entry`'}, {'time': '0.011', 'sql': 'SELECT `entry`.`id`, <snip> FROM `entry` LIMIT 10 OFFSET 30'}]
Finally, the entries have been fetched.
It is. Django's pagination uses the same rules/optimizations that apply to querysets.
This means it will start evaluating on return HttpResponse(output)
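The two statements a page costs (one COUNT, one LIMIT/OFFSET fetch) can be reproduced outside Django. Below is a rough sketch with the standard sqlite3 module, recording each executed statement through set_trace_callback; fetch_page and the entry table are made up for the example and only mimic what the session above shows for the paginator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO entry (title) VALUES (?)",
    [("entry {}".format(n),) for n in range(1, 96)],
)
conn.commit()

# From here on, record every SQL statement SQLite executes.
executed = []
conn.set_trace_callback(executed.append)

def fetch_page(number, per_page=10):
    """Fetch one page: a COUNT for the total, then only that page's rows."""
    total = conn.execute("SELECT COUNT(*) FROM entry").fetchone()[0]
    offset = (number - 1) * per_page
    rows = conn.execute(
        "SELECT id, title FROM entry ORDER BY id LIMIT ? OFFSET ?",
        (per_page, offset),
    ).fetchall()
    return total, rows

total, rows = fetch_page(4)
page_ids = [row[0] for row in rows]
print(total, page_ids)
print(executed)
```

Only ten rows cross the wire for page 4, however large the table is, which is the same saving the lazy queryset gives the paginator.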
In Django filter statement what's the difference if I write:
.filter(name__exact='Alex')
and
.filter(name='Alex')
Thanks
There is no difference, the second one implies using the __exact.
From the documentation:
For example, the following two statements are equivalent:
>>> Blog.objects.get(id__exact=14) # Explicit form
>>> Blog.objects.get(id=14) # __exact is implied
This is for convenience, because exact lookups are the common case.
You can look at the SQL that Django will execute by converting the queryset's query property to a string:
>>> from django.contrib.auth.models import User
>>> str(User.objects.filter(username = 'name').query)
'SELECT ... WHERE `auth_user`.`username` = name '
>>> str(User.objects.filter(username__exact = 'name').query)
'SELECT ... WHERE `auth_user`.`username` = name '
So __exact makes no difference here.
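As a mental model for why the two spellings are the same, think of the keyword being split on the double underscore, with "exact" filled in when no lookup name is given. The sketch below is a simplified illustration of that idea, not Django's actual implementation; split_lookup and the KNOWN_LOOKUPS set are hypothetical.

```python
# A small, incomplete set of lookup names, just for the illustration.
KNOWN_LOOKUPS = {
    "exact", "iexact", "contains", "icontains",
    "isnull", "in", "gt", "gte", "lt", "lte",
}

def split_lookup(key):
    """Split a filter keyword into (field path, lookup name)."""
    *path, last = key.split("__")
    if path and last in KNOWN_LOOKUPS:
        return "__".join(path), last
    # No recognised lookup suffix: the whole key is a field path
    # and the lookup defaults to "exact".
    return key, "exact"

print(split_lookup("name"))
print(split_lookup("name__exact"))
print(split_lookup("name__iexact"))
print(split_lookup("author__name"))
```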
This is not exactly the same as the question but can be useful for some developers.
It depends on the database Django uses and on its collation. I am using a MySQL database with a ci (case-insensitive) collation and got a strange result.
If there is a User "Test" and I query with a trailing space:
In : User.objects.filter(username__iexact="Test ")
Out : <QuerySet []>
In : User.objects.filter(username__exact="Test ")
Out : <QuerySet [<User: Test>]>
In : User.objects.filter(username="Test ")
Out : <QuerySet [<User: Test>]>
from django.test import TestCase
from user.factories import UserFactory
from user.models import User
class TestCaseSensitiveQueryTestCase(TestCase):
    def setUp(self):
        super().setUp()
        self.QUERY = 'case sensitive username'
        self.USERNAME = 'cAse SEnSItIVE UsErNAME'
        self.user = UserFactory(name=self.USERNAME)

    def test_implicit_exact_match(self):
        with self.assertRaises(User.DoesNotExist):
            User.objects.get(name=self.QUERY)

    def test_explicit_iexact_match(self):
        User.objects.get(name__iexact=self.QUERY)