You can print a queryset's SQL as follows:
print(str(queryset.query))
However, for some reason this removes the quotation marks around parameter values, so you get:
SELECT `tableA`.`fieldA` FROM `tableA` WHERE `tableA`.`fieldB` = Foo
instead of:
SELECT `tableA`.`fieldA` FROM `tableA` WHERE `tableA`.`fieldB` = "Foo"
Notice the missing quotation marks around Foo.
How can this be corrected?
If the underlying database is PostgreSQL you can do:
from django.db import connection
sql, params = queryset.query.sql_with_params()
cursor = connection.cursor()
cursor.mogrify(sql, params)
sql_with_params() returns the query with placeholders in place of the values, together with the parameters that will be inserted into it. mogrify(), a psycopg2 cursor method, then returns the query string with those parameters bound and quoted.
Using .mogrify() for anything other than debugging is still not recommended, because the method may disappear in the future.
If you want to execute the query, you should just use .raw():
YourModel.objects.raw(sql, params)
This is not quite what you want, but if you have DEBUG = True you can use
from django.db import connection
connection.queries
Update:
Looking at the docstring of the Query __str__ method:
__str__(self)
| Returns the query as a string of SQL with the parameter values
| substituted in.
|
| Parameter values won't necessarily be quoted correctly, since that is
| done by the database interface at execution time.
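Why the values are kept apart from the SQL text can be seen at the DB-API level. The following is a minimal sketch that uses the standard-library sqlite3 module instead of Django and PostgreSQL, with a made-up table and data: the SQL string only carries a placeholder, and the driver binds the value at execution time, so a correctly quoted literal never needs to appear in the string.

```python
import sqlite3

# Hypothetical table and data, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (field_a TEXT, field_b TEXT)")
conn.execute("INSERT INTO table_a VALUES (?, ?)", ("first", "Foo"))

# Analogous to what sql_with_params() returns: SQL with a placeholder,
# plus the parameters as a separate tuple.
sql = "SELECT field_a FROM table_a WHERE field_b = ?"
params = ("Foo",)

# The driver binds the value; no quoting is done in Python.
rows = conn.execute(sql, params).fetchall()
print(rows)

# Pasting the raw value into the text gives the kind of unquoted output
# that printing the query can show. It is for reading, not for executing.
naive_sql = sql.replace("?", params[0])
print(naive_sql)
```

The second print shows SQL that a database would not interpret as intended, which is why the printed form of a query is best treated as a debugging aid.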
If this is for debugging purposes, look into django-debug-toolbar, which shows all the queries run for any view you're looking at.
My Goal
I need PostgreSQL's rank() window function applied to an annotated queryset from Django's ORM. Django's sql query has to be a subquery in order to apply the window function and this is what I'm doing so far:
queryset = Item.objects.annotate(…)
queryset_with_rank = Item.objects.raw("""
    select rank() over (order by points), *
    from (%(subquery)s)""", {'subquery': queryset.query}
)
The problem
Unfortunately, the query returned by queryset.query does not quote the parameters used for annotation correctly although the query itself is executed perfectly fine.
Example of returned query
Both queryset_with_rank.query and queryset.query contain the following:
"participation"."category" = )
"participation"."category" = amateur)
whereas I expected:
"participation"."category" = '')
"participation"."category" = 'amateur')
Question
I noticed that the Django documentation states the following about Query.__str__()
Parameter values won't necessarily be quoted correctly, since that is done by the database interface at execution time.
As long as I fix the quotation manually and pass the query to Postgres myself, everything works as expected. Is there a way to obtain the needed subquery with correct quotation? Or is there an alternative and better approach to applying a window function to a Django ORM queryset altogether?
As Django core developer Aymeric Augustin said, there's no way to get the exact query that is executed by the database backend beforehand.
I still managed to build the query the way I hoped to, although a bit cumbersome:
# Obtain the query and its parameters separately
query, params = item_queryset.query.sql_with_params()
# Put additional quotes around strings. I guess this is what
# the database adapter does as well. Quotes inside the values are
# not escaped here, so only use this with trusted parameters.
params = [
    "'{}'".format(p)
    if isinstance(p, str) else p
    for p in params
]
# Cast the list of parameters to a tuple: the % operator treats a list
# as one single argument, which fails with
# "not enough arguments for format string".
params = tuple(params)
# The subquery alias (items_subquery) is an arbitrary name.
participations = Item.objects.raw("""
    select *,
    rank() over (order by points DESC) as rank
    from ({subquery}) as items_subquery
    """.format(subquery=query % params), []
)
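An alternative to quoting the values by hand is to leave the placeholders in the subquery untouched and hand the parameters to the driver together with the outer query. With Django this would presumably mean passing the params from sql_with_params() as the second argument of raw(), which I have not verified against the models above. The sketch below only shows the principle, using the standard-library sqlite3 module (window functions need SQLite 3.25 or newer) and a hypothetical item table; the ranked() helper is made up for this example.

```python
import sqlite3

def ranked(conn, inner_sql, inner_params):
    """Wrap inner_sql as a subquery and add a rank column.

    The placeholders of the inner query are left as they are; the
    parameters are bound by the driver when the outer query runs.
    """
    outer_sql = (
        "SELECT sub.*, rank() OVER (ORDER BY points DESC) AS rnk "
        "FROM (" + inner_sql + ") AS sub "
        "ORDER BY rnk, name"
    )
    return conn.execute(outer_sql, inner_params).fetchall()

# Hypothetical data for the illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (name TEXT, category TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO item VALUES (?, ?, ?)",
    [("a", "amateur", 10), ("b", "amateur", 30),
     ("c", "pro", 20), ("d", "amateur", 30)],
)

inner_sql = "SELECT name, points FROM item WHERE category = ?"
rows = ranked(conn, inner_sql, ("amateur",))
print(rows)
```

Because the value 'amateur' never becomes part of the SQL text, there is nothing to quote or escape manually.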
My models are designed like so:
class Warehouse:
    name = ...
    sublocation = FK(Sublocation)
class Sublocation:
    name = ...
    city = FK(City)
class City:
    name = ...
    state = FK(State)
Now if I run a query:
wh = Warehouse.objects.values_list('name', 'sublocation__name',
                                   'sublocation__city__name').first()
it returns the correct result, but how many queries does it issue internally? Is Django fetching the data in one request?
Django makes only one query to the database for getting the data you described.
When you do:
wh = Warehouse.objects.values_list(
'name', 'sublocation__name', 'sublocation__city__name').first()
It translates into this query:
SELECT "myapp_warehouse"."name", "myapp_sublocation"."name", "myapp_city"."name"
FROM "myapp_warehouse" INNER JOIN "myapp_sublocation"
ON ("myapp_warehouse"."sublocation_id" = "myapp_sublocation"."id")
INNER JOIN "myapp_city" ON ("myapp_sublocation"."city_id" = "myapp_city"."id")
It gets the result in a single query. You can count number of queries in your shell like this:
from django.db import connection as c, reset_queries as rq
In [42]: rq()
In [43]: len(c.queries)
Out[43]: 0
In [44]: wh = Warehouse.objects.values_list('name', 'sublocation__name', 'sublocation__city__name').first()
In [45]: len(c.queries)
Out[45]: 1
My suggestion would be to write a test for this using assertNumQueries (see the Django testing documentation).
from django.test import TestCase
from yourproject.models import Warehouse
class TestQueries(TestCase):
    def test_query_num(self):
        """
        Assert values_list query executes 1 database query
        """
        values = ['name', 'sublocation__name', 'sublocation__city__name']
        with self.assertNumQueries(1):
            Warehouse.objects.values_list(*values).first()
FYI, I'm not sure how many queries are actually sent to the database; 1 is my current best guess. Adjust the expected number until the test passes in your project, which then pins the requirement.
There is extensive documentation on how and when querysets are evaluated in the Django docs: QuerySet API Reference.
The standard way to get good insight into how many and which queries take place during a page render is to use the Django Debug Toolbar. It can tell you precisely how many times this queryset is evaluated.
You can use django-debug-toolbar to see the real queries sent to the database.
Is it possible to run a query with a collation different from the one the database table has?
Using extra() is a little messy. Something similar can now be achieved with Func() expression (since Django 1.8):
from django.db.models import Func
username_ci = Func(
    'username',
    function='utf8_general_ci',
    template='(%(expressions)s) COLLATE "%(function)s"')
This can be used in annotate():
User.objects.annotate(uname_ci=username_ci).filter(uname_ci='joeblow').exists()
Or in order_by() to override default collation rules when sorting:
User.objects.order_by(username_ci)
Now, it still may seem messy, but if you look at the docs and code of Func(), you will discover that it is very easy to subclass it and make a reusable collation setter.
I used this trick with a Postgres database.
Here is how you can use a specific collation instead of the default collation for a given table/column. I'm assuming you always want that to be the case insensitive utf8_general_ci, but you can easily change that in the code or add it as a variable.
Note the use of the params kwarg instead of the db literal function. Params exists for the exact same purpose.
def iexact(**kw):
    fields = [['%s=%%s collate utf8_general_ci' % field, value] for (field, value) in kw.items()]
    return dict(where=[f[0] for f in fields], params=[f[1] for f in fields])
if User.objects.extra(**iexact(username='joeblow')).exists():
    status = "Found a user with this username!"
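The same idea, a per-query COLLATE clause with the value passed as a bound parameter, can be tried without a MySQL server. This is only a sketch: it uses the standard-library sqlite3 module and SQLite's built-in NOCASE collation as a stand-in for utf8_general_ci, and the users table and username_exists() helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The column uses SQLite's default (case-sensitive) BINARY collation.
conn.execute("CREATE TABLE users (username TEXT)")
conn.execute("INSERT INTO users VALUES (?)", ("JoeBlow",))

def username_exists(conn, username, collation=None):
    """Check for a username, optionally overriding the collation."""
    sql = "SELECT 1 FROM users WHERE username = ?"
    if collation is not None:
        # The collation name is a fixed identifier chosen in code,
        # never user input; only the value is a bound parameter.
        sql += " COLLATE " + collation
    return conn.execute(sql, (username,)).fetchone() is not None

exact = username_exists(conn, "joeblow")
nocase = username_exists(conn, "joeblow", "NOCASE")
print(exact, nocase)
```

With the column's default collation the lookup misses because of the case difference, while the per-query collation finds the row.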
I solved this using a bit of a hack.
Django's extra() method is like the raw() method in that both use the query statement directly:
MyModel.objects.extra(where=["name LIKE '%%" + name + "%%' COLLATE utf8_general_ci"])
But written like this, SQL injection is possible, so the name variable needs to be escaped. I searched a lot for a function that just escapes a string for the database. I found one in the MySQL-python package, but it can't escape unicode strings. The package also has a literal method on the connection, but to use it we need a connection instance (maybe because escaping depends on database characteristics).
In the end I used Django's db.connection.cursor:
from django.db import connection
cursor = connection.cursor()
name = cursor.db.connection.literal(name)[1:-1] # [1:-1] excluding quotes
This way we also need an instance, but I suppose it does not require a db connection, and I suppose this method is database independent. If I am wrong, please correct me.
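Manual escaping can usually be avoided by binding the whole LIKE pattern as a parameter, which is what the params argument of extra() is for. The sketch below shows the principle with the standard-library sqlite3 module rather than MySQL; the my_model table and the search_by_name() helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_model (name TEXT)")
conn.executemany("INSERT INTO my_model VALUES (?)",
                 [("O'Brien",), ("Smith",)])

def search_by_name(conn, fragment):
    """Substring search with the pattern passed as a bound parameter."""
    # The wildcards are added in Python; the complete pattern is then
    # bound by the driver, so quotes in the input cannot end the literal.
    pattern = "%" + fragment + "%"
    sql = "SELECT name FROM my_model WHERE name LIKE ? ORDER BY name"
    return [row[0] for row in conn.execute(sql, (pattern,))]

quoted = search_by_name(conn, "O'B")
hostile = search_by_name(conn, "' OR '1'='1")
print(quoted, hostile)
```

Note that % and _ typed by the user still act as wildcards inside the pattern; binding the parameter only protects against breaking out of the string literal.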
The above solution works. To get the reverse order, use the following snippet:
sort_value = sort.strip()
if sort_value in ['name', '-name']:
    sort = Func('name', function='C', template='(%(expressions)s) COLLATE "%(function)s"')
    if sort_value in ['-name']:
        f_res = queryset.order_by(sort).reverse()
    else:
        f_res = queryset.order_by(sort)
    return f_res
For a QuerySet in Django, we can use its .query attribute to get the raw SQL.
For example,
queryset = AModel.objects.all()
print(queryset.query)
the output could be: SELECT "id", ... FROM "amodel"
But when retrieving an object with get(), say,
item = AModel.objects.get(id=100)
how do I get the equivalent raw SQL? Note: the item might be None.
item = AModel.objects.get(id=100) is equivalent to
items = AModel.objects.filter(id=100)
if len(items) == 1:
    return items[0]
else:
    raise exception  # DoesNotExist or MultipleObjectsReturned
Thus the executed query is the same as that of AModel.objects.filter(id=100).
Also, you could check the latest item of connection.queries:
from django.db import connection # use connections for non-default dbs
print(connection.queries[-1])
And, as FoxMaSk said, install django-debug-toolbar and enjoy it in your browser.
It's the same SQL, just with a WHERE id=100 clause tacked onto the end.
However, FWIW, if a filter is specific enough to return only one result, it's the same SQL as get would produce; the only difference at that point is on the Python side, e.g.
AModel.objects.get(id=100)
is the same as:
AModel.objects.filter(id=100).get()
So, you can simply query AModel.objects.filter(id=100) and then use queryset.query with that.
If it's just for debugging purposes, you can use the Django Debug Toolbar, which can be installed with
pip install django-debug-toolbar
In a django view, I need to append string data to the end of an existing text column in my database. So, for example, say I have a table named "ATable", and it has a field named "aField". I'd like to be able to append a string to the end of "aField" in a race-condition-free way. Initially, I had this:
tableEntry = ATable.objects.get(id=100)
tableEntry.aField += aStringVar
tableEntry.save()
The problem is that if this is being executed concurrently, both can get the same "tableEntry", then they each independently update, and the last one to "save" wins, losing the data appended by the other.
I looked into this a bit and found this, which I hoped would work, using an F expression:
ATable.objects.filter(id=100).update(aField=F('aField') + aStringVar)
The problem here, is I get an SQL error, saying:
operator does not exist: text + unknown
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
I tried changing it to "str(aStringVar)" even though it's already a string, with no luck. I found a couple of Django bug reports complaining about similar issues, but I didn't see a fix or a workaround. Is there some way I can cast aStringVar such that it can be appended to the text of the F expression? BTW, I also tried "str(F('aField')) + aStringVar", but that converted the result of the F expression to the string "(DEFAULT: )".
You can use the Concat db function.
from django.db.models import Value
from django.db.models.functions import Concat
ATable.objects.filter(id=100).update(some_field=Concat('some_field', Value('more string')))
In my case, I am adding a suffix to Facebook avatar URIs like this:
FACEBOOK_URI = 'graph.facebook.com'
FACEBOOK_LARGE = '?type=large'
# ...
users = User.objects.filter(Q(avatar_uri__icontains=FACEBOOK_URI) & ~Q(avatar_uri__icontains=FACEBOOK_LARGE))
users.update(avatar_uri=Concat('avatar_uri', Value(FACEBOOK_LARGE)))
and I get SQL like this (Django 1.9):
UPDATE `user_user` SET `avatar_uri` = CONCAT(COALESCE(`user_user`.`avatar_uri`, ''), COALESCE('?type=large', ''))
WHERE (`user_user`.`avatar_uri` LIKE '%graph.facebook.com%' AND NOT (`user_user`.`avatar_uri` LIKE '%?type=large%' AND `user_user`.`avatar_uri` IS NOT NULL))
The result is all image URIs were changed from http://graph.facebook.com/<fb user id>/picture to http://graph.facebook.com/<fb user id>/picture?type=large
You can override the F object in Django with one simple change:
class CF(F):
    ADD = '||'
Then just use CF in place of F. It will emit "||" instead of "+" when generating SQL. For example, the query:
User.objects.filter(pk=100).update(email=CF('username') + '#gmail.com')
will generate the SQL:
UPDATE "auth_user" SET "email" = "auth_user"."username" || '#gmail.com'
WHERE "auth_user"."id" = 100
And even if you get this running, it isn't thread safe. While your update is running, some other process can save a model instance without knowing that the data in the database has been updated.
You have to acquire a lock, but don't forget this scenario:
Django: m = Model.objects.all()[10]
Django: m.field = field
Django: a process which takes a while (time.sleep(100))
DB: Lock table
DB: Update field
DB: Unlock table
Django: the slow process is finished
Django: m.save()
Now the field update has been undone by the stale model instance in Django (ghost write).
You can achieve this functionality with Django's select_for_update() method: https://docs.djangoproject.com/en/dev/ref/models/querysets/#select-for-update
Something like this:
from django.db import transaction
with transaction.atomic():
    obj = ATable.objects.select_for_update().get(id=100)
    obj.aField = obj.aField + aStringVar
    obj.save()
The table row is locked when .select_for_update().get() runs inside the transaction, and the lock is released when the transaction commits at the end of the atomic block, so concurrent appends to the same row are serialized.
It seems you can't do this. However, what you are trying to do could be solved using transactions.
(It looks like you are using Postgres, so if you want to do it in one query with raw SQL as suggested, || is the concatenation operator you want.)
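To make the difference concrete, here is a small sketch that contrasts read-modify-write with a single UPDATE that appends via ||. It uses the standard-library sqlite3 module instead of Postgres, and it simulates the interleaving of two workers sequentially on one connection rather than with real concurrency; the a_table schema and helper names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a_table (id INTEGER PRIMARY KEY, a_field TEXT)")
conn.execute("INSERT INTO a_table VALUES (100, 'start')")

def read_field(conn):
    row = conn.execute("SELECT a_field FROM a_table WHERE id = 100").fetchone()
    return row[0]

def save_field(conn, value):
    # Overwrites the column with a value computed in Python.
    conn.execute("UPDATE a_table SET a_field = ? WHERE id = 100", (value,))

def append_in_db(conn, suffix):
    # One statement: the database reads and writes the column together.
    conn.execute(
        "UPDATE a_table SET a_field = a_field || ? WHERE id = 100", (suffix,)
    )

# Read-modify-write: both workers read the same value, then both save.
seen_by_a = read_field(conn)
seen_by_b = read_field(conn)
save_field(conn, seen_by_a + "-A")
save_field(conn, seen_by_b + "-B")
lost_update_result = read_field(conn)

# Reset, then append inside the database instead.
save_field(conn, "start")
append_in_db(conn, "-A")
append_in_db(conn, "-B")
atomic_result = read_field(conn)

print(lost_update_result, atomic_result)
```

In the first variant the "-A" suffix is overwritten by the second save; in the second variant both suffixes survive, because neither append depends on a value read earlier in Python.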