Is it possible to run a query with a collation different from the one the database table has?
Using extra() is a little messy. Something similar can now be achieved with Func() expression (since Django 1.8):
from django.db.models import Func

username_ci = Func(
    'username',
    function='utf8_general_ci',
    template='(%(expressions)s) COLLATE "%(function)s"')
This can be used in annotate():
User.objects.annotate(uname_ci=username_ci).filter(uname_ci='joeblow').exists()
Or in order_by() to override default collation rules when sorting:
User.objects.order_by(username_ci)
Now, it still may seem messy, but if you look at the docs and code of Func(), you will discover that it is very easy to subclass it and make a reusable collation setter.
I used this trick with a Postgres database.
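The effect of overriding the collation in ORDER BY can be seen without a Django project. The following is only a sketch using Python's built-in sqlite3 module; the users table is hypothetical, and SQLite's built-in BINARY and NOCASE collations stand in for database-specific ones such as utf8_general_ci or "C".

```python
import sqlite3

# In-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT)")
conn.executemany(
    "INSERT INTO users (username) VALUES (?)",
    [("apple",), ("Banana",), ("cherry",)],
)


def usernames_sorted(collation):
    """Return usernames ordered under the given built-in SQLite collation."""
    # Collation names cannot be bound as parameters, so only allow known ones.
    if collation not in ("BINARY", "NOCASE"):
        raise ValueError("unsupported collation: %r" % collation)
    sql = "SELECT username FROM users ORDER BY username COLLATE " + collation
    return [row[0] for row in conn.execute(sql)]


binary_order = usernames_sorted("BINARY")  # byte order: uppercase sorts first
nocase_order = usernames_sorted("NOCASE")  # ASCII case is ignored
```

The same column gives two different orderings depending only on the COLLATE clause attached to the query, which is what the Func() template above produces.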
Here is how you can use a specific collation instead of the default collation for a given table/column. I'm assuming you always want that to be the case insensitive utf8_general_ci, but you can easily change that in the code or add it as a variable.
Note the use of the params kwarg instead of the db literal function. Params exists for the exact same purpose.
def iexact(**kw):
    fields = [['%s=%%s collate utf8_general_ci' % field, value] for (field, value) in kw.items()]
    return dict(where=[f[0] for f in fields], params=[f[1] for f in fields])

if User.objects.extra(**iexact(username='joeblow')).exists():
    status = "Found a user with this username!"
I solved this with a bit of a hack.
Django's extra() method is much like raw(): both pass the SQL you give them to the database directly:
MyModel.objects.extra(where=["name LIKE '%%" + name + "%%' COLLATE utf8_general_ci"])
But written like this, SQL injection is possible, so the name variable needs to be escaped. I searched a lot for a function that just escapes a string for the database. I found one in the MySQL-python package, but it can't escape unicode strings. The package also has a literal method on the connection, but to use it we need a connection instance (maybe because escaping depends on the database's characteristics).
In the end I used Django's db.connection.cursor:
from django.db import connection
cursor = connection.cursor()
name = cursor.db.connection.literal(name)[1:-1] # [1:-1] excluding quotes
This way we also need an instance, but I suppose it does not require a separate db connection. I also suppose this method is database independent. If I am wrong, please correct me.
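An alternative to escaping by hand is to bind the value as a query parameter, which is the purpose the params kwarg of extra() serves. The sketch below shows the idea with Python's built-in sqlite3 module rather than MySQL; the mymodel table and the search_by_name helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mymodel (name TEXT)")
conn.executemany(
    "INSERT INTO mymodel (name) VALUES (?)",
    [("Joe Blow",), ("joe bloggs",), ("Jane",)],
)


def search_by_name(fragment):
    """Substring search with the value bound as a parameter, never concatenated."""
    # The wildcards are added to the bound value, not to the SQL text.
    pattern = "%" + fragment + "%"
    sql = "SELECT name FROM mymodel WHERE name LIKE ? ORDER BY name"
    return [row[0] for row in conn.execute(sql, (pattern,))]


# SQLite's LIKE ignores ASCII case by default, so both "joe" rows match.
matches = search_by_name("JOE")
# A classic injection attempt is treated as plain data and matches nothing.
hostile = search_by_name("' OR '1'='1")
```

Because the driver receives the SQL text and the value separately, no quoting or [1:-1] slicing is needed.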
The above solution works. To get the reverse order, the following snippet can be used:
sort_value = sort.strip()
if sort_value in ['name', '-name']:
    sort = Func('name', function='C', template='(%(expressions)s) COLLATE "%(function)s"')
    if sort_value in ['-name']:
        f_res = queryset.order_by(sort).reverse()
    else:
        f_res = queryset.order_by(sort)
    return f_res
I need to make a get query like:
obj = Current.objects.get(Code='M01.C0001')
But the query raises a "Multiple Objects Returned" error because the database has another record with a similar unicode string, 'M01.Ç0001':
[<obj: M01.Ç0001>, <obj: M01.C0001>]
I tried to fetch the data with field lookup functions, but that did not work either.
I googled around but I didn't find a way to temporarily set the Collation for this query.
Is it possible to temporarily set collation during executing a get query in Django 1.3?
SOLUTION:
I solved my problem by using a raw Django query and adding COLLATE to the SQL string.
obj = Current.objects.raw("SELECT * FROM Current WHERE Code = 'M01.C0001' COLLATE utf8_bin;")
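To see how a per-query COLLATE narrows an equality match, here is a small sketch with Python's built-in sqlite3 module. It is only an analogy: the codes table is hypothetical, and SQLite's NOCASE/BINARY pair (which differs only in ASCII case handling) stands in for MySQL's utf8_general_ci/utf8_bin pair, which is what makes 'C' and 'Ç' compare equal in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The column defaults to the case-insensitive NOCASE collation.
conn.execute("CREATE TABLE codes (code TEXT COLLATE NOCASE)")
conn.executemany(
    "INSERT INTO codes (code) VALUES (?)",
    [("M01.C0001",), ("m01.c0001",)],
)

# Default collation: both rows compare equal to the search value.
loose = conn.execute(
    "SELECT code FROM codes WHERE code = ?", ("M01.C0001",)
).fetchall()

# Per-query override: an explicit COLLATE wins over the column default.
strict = conn.execute(
    "SELECT code FROM codes WHERE code = ? COLLATE BINARY", ("M01.C0001",)
).fetchall()
```

With the default collation a get() would see two rows; with the explicit binary collation only the exact match remains.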
Collation is a database property, so you cannot do that.
Change the collation of the database instead.
I have a simple query on django's built in comments model and getting the error below with heroku's postgreSQL database:
DatabaseError: operator does not exist: integer = text LINE 1:
... INNER JOIN "django_comments" ON ("pins_pin"."id" = "django_...
^
HINT: No operator matches the given name and argument type(s).
You might need to add explicit type casts.
After googling around it seems this error has been addressed many times before in django, but I'm still getting it (all related issues were closed 3-5 years ago) . I am using django version 1.4 and the latest build of tastypie.
The query is made under orm filters and works perfectly with my development database (sqlite3):
class MyResource(ModelResource):
    comments = fields.ToManyField('my.api.api.CmntResource', 'comments', full=True, null=True)

    def build_filters(self, filters=None):
        if filters is None:
            filters = {}
        orm_filters = super(MyResource, self).build_filters(filters)
        if 'cmnts' in filters:
            orm_filters['comments__user__id__exact'] = filters['cmnts']

class CmntResource(ModelResource):
    user = fields.ToOneField('my.api.api.UserResource', 'user', full=True)
    site_id = fields.CharField(attribute='site_id')
    content_object = GenericForeignKeyField({
        My: MyResource,
    }, 'content_object')
    username = fields.CharField(attribute='user__username', null=True)
    user_id = fields.CharField(attribute='user__id', null=True)
Anybody have any experience with getting around this error without writing raw SQL?
PostgreSQL is "strongly typed" - that is, every value in every query has a particular type, either defined explicitly (e.g. the type of a column in a table) or implicitly (e.g. the values input into a WHERE clause). All functions and operators, including =, have to be defined as accepting specific types - so, for instance there is an operator for VarChar = VarChar, and a different one for int = int.
In your case, you have a column which is explicitly defined as type int, but you are comparing it against a value which PostgreSQL has interpreted as type text.
SQLite, on the other hand, is "weakly typed" - values are freely treated as being of whatever type best suits the action being performed. So in your dev SQLite database the operation '42' = 42 can be computed just fine, where PostgreSQL would need a specific definition of VarChar = int (or text = int, text being the type for unbounded strings in PostgreSQL).
Now, PostgreSQL will sometimes be helpful and automatically "cast" your values to make the types match a known operator, but more often, as the hint says, you need to do it explicitly. If you were writing the SQL yourself, an explicit type cast could look like WHERE id = CAST('42' AS INT) (or WHERE CAST(id AS text) = '42').
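The SQLite half of this can be reproduced with Python's built-in sqlite3 module. The sketch below uses a hypothetical pins table and shows that a text value compared against an INTEGER column still matches in SQLite, next to the explicit cast and the Python-side conversion that a stricter database would want. It does not run against PostgreSQL, so the PostgreSQL behaviour is not demonstrated here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pins (id INTEGER, title TEXT)")
conn.execute("INSERT INTO pins (id, title) VALUES (42, 'example pin')")

# SQLite converts the text '42' to suit the INTEGER column, so this matches.
implicit = conn.execute(
    "SELECT title FROM pins WHERE id = ?", ("42",)
).fetchall()

# The explicit cast that a strongly typed database would want spelled out.
explicit = conn.execute(
    "SELECT title FROM pins WHERE id = CAST(? AS INTEGER)", ("42",)
).fetchall()

# Converting on the Python side before querying avoids the question entirely.
converted = conn.execute(
    "SELECT title FROM pins WHERE id = ?", (int("42"),)
).fetchall()
```

All three queries find the row in SQLite, which is why the mismatch only surfaces after moving to PostgreSQL.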
Since you're not, you need to ensure that the input you give to the query generator is an actual integer, not just a string which happens to consist of digits. I suspect this is as simple as using fields.IntegerField rather than fields.CharField, but I don't actually know Django, or even Python, so I thought I'd give you the background in the hope you can take it from there.
Building on IMSoP's answer: This is a limitation of django's ORM layer when a Generic foreign key uses a text field for the object_id and the object's id field is not a text field. Django does not want to make any assumptions or cast the object's id as something it's not. I found an excellent article on this http://charlesleifer.com/blog/working-around-django-s-orm-to-do-interesting-things-with-gfks/.
The author of the article, Charles Leifer, came up with a very cool solution for queries that are affected by this, and it will be very useful in dealing with this issue moving forward.
Alternatively, I managed to get my query to work as follows:
if 'cmnts' in filters:
    comments = Comment.objects.filter(user__id=filters['cmnts'], content_type__name='my', site_id=settings.SITE_ID).values_list('object_pk', flat=True)
    comments = [int(c) for c in comments]
    orm_filters['pk__in'] = comments
Originally I was searching for a way to modify the SQL similar to what Charles has done, but it turns out all I had to do was break the query into two parts and convert the str(id)'s to int(id)'s.
To avoid hacking your ORM or external software, Postgres lets you register your own casts and comparison operators. Please see the example in a similar question.
Let's say that I have an Address model with a postcode field. I can look up addresses with a postcode starting with "123" with this line:
Address.objects.filter(postcode__startswith="123")
Now, I need to do this search the "other way around". I have an Address model with a postcode_prefix field, and I need to retrieve all the addresses for which postcode_prefix is a prefix of a given code, like "12345". So if in my db I had 2 addresses with postcode_prefix = "123" and "234", only the first one would be returned.
Something like:
Address.objects.filter("12345".startswith(postcode_prefix))
The problem is that this doesn't work.
The only solution I can come up with is to perform a filter on the first char, like:
Address.objects.filter(postcode_prefix__startswith="12345"[0])
and then, when I get the results, make a list comprehension that filters them properly, like this:
results = [r for r in results if "12345".startswith(r.postcode_prefix)]
Is there a better way to do it in django?
Edit: This does not answer the original question but how to word a query the other way around.
I think what you are trying to do with your "something like" line is properly written as this:
Address.objects.filter(postcode__startswith=postcode_prefix)
In SQL terms, what you want to achieve reads like ('12345' is the postcode you are searching for):
SELECT *
FROM address
WHERE '12345' LIKE postcode_prefix||'%'
This is not really a standard query and I do not see any possibility to achieve this in Django using only get()/filter().
However, Django offers a way to provide additional SQL clauses with extra():
postcode = '12345'
Address.objects.extra(where=["%s LIKE postcode_prefix||'%%'"], params=[postcode])
Please see the Django documentation on extra() for further reference. Also note that the extra contains pure SQL, so you need to make sure that the clause is valid for your database.
Hope this works for you.
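The reversed LIKE can be tried outside Django as well. This is a minimal sketch with Python's built-in sqlite3 module and a hypothetical address table; note that with sqlite3's ? placeholders the percent sign is written once, whereas the extra() call above doubles it because Django formats the clause with %s.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE address (postcode_prefix TEXT)")
conn.executemany(
    "INSERT INTO address (postcode_prefix) VALUES (?)",
    [("123",), ("234",), ("12",)],
)


def prefixes_of(postcode):
    """Return the stored prefixes that the given postcode starts with."""
    sql = (
        "SELECT postcode_prefix FROM address "
        "WHERE ? LIKE postcode_prefix || '%' "
        "ORDER BY postcode_prefix"
    )
    return [row[0] for row in conn.execute(sql, (postcode,))]


found = prefixes_of("12345")
```

One caveat of this approach: a stored prefix that itself contains % or _ would be interpreted as a LIKE wildcard.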
Bit of a mouthful but you can do this by annotating your search value and then filtering against it. All happens pretty quickly in-database.
from django.db.models import CharField, F, Value

Address.objects.exclude(
    postcode_prefix=''
).annotate(
    postcode=Value('12345', output_field=CharField())
).filter(
    postcode__startswith=F('postcode_prefix')
)
The exclude is only necessary if postcode_prefix can be empty. An empty prefix would produce the SQL pattern LIKE '%', which would match every postcode.
I'm sure you could do this via a nice templated function these days too... But this is clean enough for me.
A possible alternative. (I have no idea how its execution time compares to the accepted solution, which uses a column as the second argument to LIKE.)
q = reduce(lambda a, b: a | b, [Q(postcode_prefix=postcode[:i+1]) for i in range(len(postcode))])
Thus, you generate all prefixes of the search value and OR them together as exact matches on postcode_prefix. In Python 3, reduce has to be imported from functools.
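The prefix-generation idea can be sketched without Django. In this example (Python's built-in sqlite3 module, hypothetical address table and all_prefixes helper) the stored prefix is compared for equality against every prefix of the search value, which is the same set of rows an OR of exact-match conditions would select.

```python
import sqlite3


def all_prefixes(value):
    """Return every non-empty prefix of value, shortest first."""
    return [value[: i + 1] for i in range(len(value))]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE address (postcode_prefix TEXT)")
conn.executemany(
    "INSERT INTO address (postcode_prefix) VALUES (?)",
    [("123",), ("234",), ("19",)],
)

candidates = all_prefixes("12345")
# One placeholder per candidate; only placeholders are interpolated into the SQL.
placeholders = ", ".join("?" for _ in candidates)
rows = conn.execute(
    "SELECT postcode_prefix FROM address WHERE postcode_prefix IN (%s)" % placeholders,
    candidates,
).fetchall()
```

The row "19" shares a first character with the search value but is not one of its prefixes, so it is correctly left out.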
The raw SQL query that would do what you need looks something like this:
select * from postal_code_table where '1234567' like postal_code||'%'
This query selects any postal_code from your table that is a substring of '1234567' and that also matches from the beginning, i.e. '123', '1234', etc.
Now to implement this in Django, the preferred method is using a custom look up:
from django.db.models.fields import Field
from django.db.models import Lookup

@Field.register_lookup
class LowerStartswithContainedBy(Lookup):
    '''Postgres LIKE query statement'''
    lookup_name = 'istartswithcontainedby'

    def as_sql(self, compiler, connection):
        lhs, lhs_params = self.process_lhs(compiler, connection)
        rhs, rhs_params = self.process_rhs(compiler, connection)
        # rhs comes first in the SQL below, so its params must come first too
        params = rhs_params + lhs_params
        return f"LOWER({rhs}) LIKE LOWER({lhs}) || '%%'", params
Now you can write a django query such as the following:
PostCode.objects.filter(code__istartswithcontainedby='1234567')
Similarly, if you are just looking for substring and do not require the startswith condition, simply modify the return line of as_sql method to the following:
return f"LOWER({rhs}) LIKE '%%' || LOWER({lhs}) || '%%'", params
For more detailed explanation, see my git gist Django custom lookup
A. If it were not for the issue https://code.djangoproject.com/ticket/13363,
you could do this:
queryset.extra(select={'myconst': "'this superstring is myconst value'"}).filter(myconst__contains=F('myfield'))
Maybe they will fix the issue and this will work.
B. If it were not for issue 16731 (sorry for not providing the full URL, I do not have enough rep; see the other ticket above), you could filter on fields added with .annotate() by creating a custom aggregate function, like here:
http://coder.cl/2011/09/custom-aggregates-on-django/
C. The last option, and the successful one: I managed to do this by monkeypatching the following:
django.db.models.sql.Query.query_terms
django.db.models.fields.Field.get_prep_lookup
django.db.models.fields.Field.get_db_prep_lookup
django.db.models.sql.where.WhereNode.make_atom
I just defined a custom lookup '_starts', which has the reverse logic of '_startswith'.
I'm trying to create a custom field which would automatically add COLLATE information into the WHERE part of SQL query:
class IgnoreDiacriticsField(models.TextField):
    def get_prep_lookup(self, lookup_type, value):
        if lookup_type == 'exact':
            return ' "' + self.get_prep_value(value) + '" COLLATE utf8_general_ci'
When I perform a query like this:
result = ModelClass.objects.filter(field='value')
then nothing is found, even though the query (print result.query) is valid and matches several rows. Am I doing something wrong?
The reason why I'm adding the collation information is that I want to perform queries on those fields and ignore any diacritics.
Are you using MySQLdb 1.2.1p2 by any chance? From the Django documentation:
If you're using MySQLdb 1.2.1p2, Django's standard CharField class
will return unicode strings even with utf8_bin collation. However,
TextField fields will be returned as an array.array instance (from
Python's standard array module). There isn't a lot Django can do about
that, since, again, the information needed to make the necessary
conversions isn't available when the data is read in from the
database. This problem was fixed in MySQLdb 1.2.2, so if you want to
use TextField with utf8_bin collation, upgrading to version 1.2.2 and
then dealing with the bytestrings (which shouldn't be too difficult)
as described above is the recommended solution.
I have first_name, last_name & alias (optional) which I need to search for. So, I need a query to give me all the names that have an alias set.
If only I could do:
Name.objects.filter(alias!="")
So, what is the equivalent to the above?
You could do this:
Name.objects.exclude(alias__isnull=True)
If you need to exclude null values and empty strings, the preferred way to do so is to chain together the conditions like so:
Name.objects.exclude(alias__isnull=True).exclude(alias__exact='')
Chaining these methods together basically checks each condition independently: in the above example, we exclude rows where alias is either null or an empty string, so you get all Name objects that have a not-null, not-empty alias field. The generated SQL would look something like:
SELECT * FROM Name WHERE alias IS NOT NULL AND alias != ''
You can also pass multiple arguments to a single call to exclude, which would ensure that only objects that meet every condition get excluded:
Name.objects.exclude(some_field=True, other_field=True)
Here, rows in which some_field and other_field are true get excluded, so we get all rows where both fields are not true. The generated SQL code would look a little like this:
SELECT * FROM Name WHERE NOT (some_field = TRUE AND other_field = TRUE)
Alternatively, if your logic is more complex than that, you could use Django's Q objects:
from django.db.models import Q
Name.objects.exclude(Q(alias__isnull=True) | Q(alias__exact=''))
For more info see this page and this page in the Django docs.
As an aside: My SQL examples are just an analogy--the actual generated SQL code will probably look different. You'll get a deeper understanding of how Django queries work by actually looking at the SQL they generate.
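To make the "not null and not empty" condition concrete, here is a small sketch using Python's built-in sqlite3 module with a hypothetical name table. It runs hand-written SQL of the kind shown in the analogy above, not the SQL Django actually generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE name (first_name TEXT, alias TEXT)")
conn.executemany(
    "INSERT INTO name (first_name, alias) VALUES (?, ?)",
    [("Ann", "annie"), ("Bob", ""), ("Cid", None)],
)

with_alias = [
    row[0]
    for row in conn.execute(
        "SELECT first_name FROM name "
        "WHERE alias IS NOT NULL AND alias != '' ORDER BY first_name"
    )
]

# NULL != '' evaluates to NULL, which WHERE treats as false, so the
# inequality on its own already drops the NULL row as well.
inequality_only = [
    row[0]
    for row in conn.execute(
        "SELECT first_name FROM name WHERE alias != '' ORDER BY first_name"
    )
]
```

Only the row with a real alias survives in both cases; the explicit IS NOT NULL mainly documents the intent.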
Name.objects.filter(alias__gt='',alias__isnull=False)
Firstly, the Django docs strongly recommend not using NULL values for string-based fields such as CharField or TextField. Read the documentation for the explanation:
https://docs.djangoproject.com/en/dev/ref/models/fields/#null
Solution:
You can also chain together methods on QuerySets, I think. Try this:
Name.objects.exclude(alias__isnull=True).exclude(alias="")
That should give you the set you're looking for.
1. When using exclude(), keep the following in mind to avoid common mistakes:
Do not put multiple conditions into a single exclude() call the way you would with filter(). To exclude rows that match any one of several conditions, chain multiple exclude() calls.
Example: (NOT a AND NOT b)
Entry.objects.exclude(title='').exclude(headline='')
equal to
SELECT... WHERE NOT title = '' AND NOT headline = ''
======================================================
2. Only pass multiple conditions to a single exclude() when you really know what it does:
Example: NOT (a AND b)
Entry.objects.exclude(title='', headline='')
equal to
SELECT.. WHERE NOT (title = '' AND headline = '')
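The difference between the two shapes is easiest to see on data. The sketch below uses Python's built-in sqlite3 module with a hypothetical entry table and runs the two WHERE clauses from the examples above directly; the ids_where helper is illustrative only and takes fixed, trusted SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (id INTEGER, title TEXT, headline TEXT)")
conn.executemany(
    "INSERT INTO entry (id, title, headline) VALUES (?, ?, ?)",
    [(1, "", ""), (2, "", "news"), (3, "post", ""), (4, "post", "news")],
)


def ids_where(condition):
    """Run a fixed, trusted WHERE clause and return the matching ids in order."""
    sql = "SELECT id FROM entry WHERE %s ORDER BY id" % condition
    return [row[0] for row in conn.execute(sql)]


# Shape of exclude(title='').exclude(headline=''): NOT a AND NOT b
chained = ids_where("NOT title = '' AND NOT headline = ''")
# Shape of exclude(title='', headline=''): NOT (a AND b)
combined = ids_where("NOT (title = '' AND headline = '')")
```

Chaining keeps only rows where both fields are filled in, while the single call drops only the row where both are empty.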
If you want to exclude null (None), empty string (""), as well as a string containing white spaces (" "), you can use the __regex along with __isnull filter option
Name.objects.filter(
    alias__isnull=False,
    alias__regex=r"\S+"
)
alias__isnull=False excludes all rows where the column is null.
alias__regex = r"\S+" makes sure that the column value contains at least one non-whitespace character.
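What the pattern accepts can be checked in plain Python. This is a sketch under the assumption that the database's regex engine treats \S the same way Python's re module does, which can vary between backends; is_meaningful is a hypothetical helper that mirrors the two filter conditions.

```python
import re

# Same pattern as the alias__regex lookup above.
HAS_NON_WHITESPACE = re.compile(r"\S+")


def is_meaningful(alias):
    """True when alias is not None and contains a non-whitespace character."""
    return alias is not None and HAS_NON_WHITESPACE.search(alias) is not None


samples = [None, "", "   ", "\t\n", "joe", "  joe  "]
kept = [value for value in samples if is_meaningful(value)]
```

None, the empty string and whitespace-only strings are all rejected, while values with surrounding spaces are kept as they are.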
From Django 1.8,
from django.db.models.functions import Length
Name.objects.annotate(alias_length=Length('alias')).filter(alias_length__gt=0)
You can simply do this:
Name.objects.exclude(alias="").exclude(alias=None)
It's really just that simple. filter is used to match and exclude is to match everything but what it specifies. This would evaluate into SQL as NOT alias='' AND alias IS NOT NULL.
Another approach using a generic isempty lookup, that can be used with any field.
It can also be used by django rest_framework or other apps that use django lookups:
from distutils.util import strtobool
from django.db.models import Field
from django.db.models.lookups import BuiltinLookup

@Field.register_lookup
class IsEmpty(BuiltinLookup):
    lookup_name = 'isempty'
    prepare_rhs = False

    def as_sql(self, compiler, connection):
        sql, params = compiler.compile(self.lhs)
        condition = self.rhs if isinstance(self.rhs, bool) else bool(strtobool(self.rhs))
        if condition:
            # Parenthesised so the OR cannot leak into surrounding conditions;
            # the left-hand side appears twice, so its params are repeated.
            return "(%s IS NULL OR %s = '')" % (sql, sql), params * 2
        else:
            return "%s <> ''" % sql, params
You can then use it like this:
Name.objects.filter(alias__isempty=False)
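One practical note on the lookup above: distutils was deprecated by PEP 632 and removed in Python 3.12, so strtobool is no longer importable there. A small stand-in could look like the sketch below; to_bool is a hypothetical name, and the accepted strings follow the ones strtobool recognised.

```python
_TRUE = {"y", "yes", "t", "true", "on", "1"}
_FALSE = {"n", "no", "f", "false", "off", "0"}


def to_bool(value):
    """Convert a bool or a truthy/falsy string to a bool; raise on anything else."""
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in _TRUE:
        return True
    if text in _FALSE:
        return False
    raise ValueError("invalid truth value %r" % (value,))
```

Since it passes real booleans through unchanged, it could replace the isinstance check and the strtobool call in as_sql with a single to_bool(self.rhs).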
This is another simple way to do it (note that it excludes only NULL values, not empty strings):
Name.objects.exclude(alias=None)