Looking up value in JSONField with unaccent and icontains


I have a Model with a JSONField:
class MyModel(models.Model):
    locale_names = models.JSONField()
The shape of the JSON Field is simple: keys are language codes (en, fr...) and values are translated strings.
I'm trying to build a search query that does an unaccented icontains search on a translated value:
MyModel.objects.filter(locale_names__en__unaccent__icontains="Test")
This does not give the expected results, because Django interprets "unaccent" as a key to look up in the JSON, rather than the unaccent PostgreSQL function:
-- Expected SQL query: something like
SELECT "app_model"."*" ...
FROM "app_model"
WHERE UPPER(UNACCENT(("app_model"."locale_names" ->> 'en')::text)) LIKE UPPER(UNACCENT('%Test%'))
LIMIT 21
-- Actual SQL query
SELECT "app_model"."*" ...
FROM "app_model"
WHERE UPPER(("app_model"."locale_names" #>> ARRAY['en','unaccent'])::text) LIKE UPPER('%Test%')
LIMIT 21
How can I tell Django to interpret __unaccent as the PostgreSQL function rather than a JSON path?
EDIT:
I'm using Django 3.2
Doing __unaccent__icontains lookups on regular CharFields works as expected.

Unfortunately, JSONField does not support the unaccent lookup.
From the documentation:
"The unaccent lookup can be used on CharField and TextField."

As a complement to @Benbb96's answer above, my workaround was to write the WHERE clause I needed using the soon-to-be-deprecated QuerySet.extra method:
MyModel.objects.extra(
    where=[
        "UPPER(UNACCENT((app_model.locale_names->>'en')::text)) LIKE UPPER(UNACCENT(%s))"
    ],
    # The LIKE wildcards go in the bound parameter, not in the SQL string.
    params=["%Test%"],
)
As requested by the Django team, I created a ticket with them so that this use case can be addressed without QuerySet.extra().
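When testing a workaround like the one above, it can help to have a plain-Python reference for what "unaccented icontains" should return. The following is a minimal sketch, not PostgreSQL's implementation: it approximates UNACCENT by stripping Unicode combining marks, whereas the real extension uses a rules dictionary, so a few characters may be handled differently. The helper names and the sample rows are hypothetical, and LIKE wildcards inside the search term are not interpreted.

```python
import json
import unicodedata


def unaccent(text: str) -> str:
    """Approximate UNACCENT: decompose, then drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def unaccent_icontains(haystack: str, needle: str) -> bool:
    """Mirror UPPER(UNACCENT(haystack)) LIKE UPPER(UNACCENT('%needle%'))."""
    return unaccent(needle).upper() in unaccent(haystack).upper()


# Hypothetical rows shaped like MyModel.locale_names.
rows = [
    json.loads('{"en": "Tést value", "fr": "Valeur de test"}'),
    json.loads('{"en": "Other", "fr": "Autre"}'),
]

# Keep the rows whose English name matches "Test" once accents and case are ignored.
matches = [row for row in rows if unaccent_icontains(row.get("en", ""), "Test")]
```

The resulting list can serve as the expected value in a test that runs the real query against PostgreSQL.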

Related

Include wildcard (%) in Django 'contains' query

According to Django docs
Entry.objects.get(headline__contains='Lennon')
Roughly translates to this SQL:
SELECT ... WHERE headline LIKE '%Lennon%';
But what if I want to do something like this (removing a wildcard):
SELECT ... WHERE headline LIKE '%Lennon';
What would the Django query be?

The lookups for partial matches you are looking for are startswith and endswith (endswith is the one that corresponds to LIKE '%Lennon'):
Entry.objects.filter(headline__startswith='Lennon')
Entry.objects.filter(headline__endswith='Lennon')
You can also use the case insensitive variants, istartswith and iendswith:
Entry.objects.filter(headline__istartswith='lennon')
Entry.objects.filter(headline__iendswith='lennon')
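The relationship between these lookups and the position of the wildcard can be checked directly in SQL. Here is a small sketch using Python's built-in sqlite3 module as a stand-in database; the entry table and its rows are made up for illustration, and the exact SQL Django emits differs between backends.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (headline TEXT)")
conn.executemany(
    "INSERT INTO entry (headline) VALUES (?)",
    [("Lennon and McCartney",), ("John Lennon",), ("The Lennon legacy",)],
)


def headlines_like(pattern):
    """Return the headlines matching a LIKE pattern, sorted alphabetically."""
    cursor = conn.execute(
        "SELECT headline FROM entry WHERE headline LIKE ? ORDER BY headline",
        (pattern,),
    )
    return [row[0] for row in cursor]


contains = headlines_like("%Lennon%")   # wildcard on both sides
startswith = headlines_like("Lennon%")  # wildcard only at the end
endswith = headlines_like("%Lennon")    # wildcard only at the start
```

Note that SQLite's LIKE is case-insensitive for ASCII characters by default, so this sketch does not show the difference between startswith and istartswith.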

Is there a way to add "Collation" in to Django 1.3 query?

I need to make a get query like:
obj = Current.objects.get(Code='M01.C0001')
But the query raises a "Multiple Objects Returned" error because the database has another record with a similar Unicode string, 'M01.Ç0001':
[<obj: M01.Ç0001>, <obj: M01.C0001>]
I tried fetching the data with field lookup functions, but that did not work either.
I googled around but didn't find a way to temporarily set the collation for this query.
Is it possible to temporarily set the collation while executing a get query in Django 1.3?
SOLUTION:
I solved my problem by using a raw Django query and adding COLLATE to the SQL string.
obj = Current.objects.raw("SELECT * FROM Current WHERE Code = 'M01.C0001' COLLATE utf8_bin;")

Collation is a database property, so you cannot set it from the ORM query.
Change the collation in the database instead.
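The effect of a per-query COLLATE clause, as used in the question's solution, can be tried out without a MySQL server. This is a minimal sketch that assumes SQLite as a stand-in: a custom collation named nodiacritics (hypothetical) plays the role of an accent-insensitive column collation, and COLLATE BINARY plays the role of utf8_bin. The table name current_codes is invented for illustration, and the value is bound as a parameter rather than written into the SQL string.

```python
import sqlite3
import unicodedata


def strip_marks(text):
    """Drop combining marks so that 'Ç' compares equal to 'C'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def nodiacritics(a, b):
    """Three-way comparison that ignores diacritics."""
    a, b = strip_marks(a), strip_marks(b)
    return (a > b) - (a < b)


conn = sqlite3.connect(":memory:")
conn.create_collation("nodiacritics", nodiacritics)
conn.execute("CREATE TABLE current_codes (code TEXT COLLATE nodiacritics)")
conn.executemany(
    "INSERT INTO current_codes (code) VALUES (?)",
    [("M01.Ç0001",), ("M01.C0001",)],
)

# The column's default collation treats both codes as equal.
loose = [r[0] for r in conn.execute(
    "SELECT code FROM current_codes WHERE code = ?", ("M01.C0001",))]

# An explicit COLLATE on the comparison overrides the column collation.
exact = [r[0] for r in conn.execute(
    "SELECT code FROM current_codes WHERE code = ? COLLATE BINARY", ("M01.C0001",))]
```

With the default collation both rows come back, which is the situation that triggers the error in get(); with the explicit binary collation only the exact code matches.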

How to use a tsvector field to perform ranking in Django with postgresql full-text search?

I need to perform a ranking query using postgresql full-text search feature and Django with django.contrib.postgres module.
According to the doc, it is quite easy to do this using the SearchRank class by doing the following:
>>> from django.contrib.postgres.search import SearchQuery, SearchRank, SearchVector
>>> vector = SearchVector('body_text')
>>> query = SearchQuery('cheese')
>>> Entry.objects.annotate(rank=SearchRank(vector, query)).order_by('-rank')
This probably works well but this is not exactly what I want since I have a field in my table which already contains tsvectorized data that I would like to use (instead of recomputing tsvector at each search query).
Unfortunately, I can't figure out how to provide this tsvector field to the SearchRank class instead of a SearchVector object on a raw data field.
Is anyone able to indicate how to deal with this?
Edit:
Of course, simply trying to instantiate a SearchVector from the tsvector field does not work and fails with this error (approximate, since I translated it from French):
django.db.utils.ProgrammingError: ERROR: function to_tsvector(tsvector) does not exist

If your model has a SearchVectorField like so:
from django.contrib.postgres.search import SearchVectorField
from django.db import models

class Entry(models.Model):
    ...
    search_vector = SearchVectorField()
you would use the F expression:
from django.db.models import F
...
Entry.objects.annotate(
    rank=SearchRank(F('search_vector'), query)
).order_by('-rank')

I've been seeing mixed answers here on SO and in the official documentation. F Expressions aren't used in the documentation for this. However it may just be that the documentation doesn't actually provide an example for using SearchRank with a SearchVectorField.
Looking at the output of .explain(analyze=True) :
Without the F Expression:
Sort Key: (ts_rank(to_tsvector(COALESCE((search_vector)::text, ''::text))
When the F Expression is used:
Sort Key: (ts_rank(search_vector, ...)
In my experience, the only difference between using an F Expression and the field name in quotes is that the F Expression returns much faster but is sometimes less accurate, depending on how you structure the query. In some cases it can be useful to enforce it with a COALESCE. In my case, using the F Expression with my SearchVectorField gives roughly a 3-5x speed boost.
Ensuring your SearchQuery has a config kwarg also improves things dramatically.

Query Django's HStoreField values using LIKE

I have a model with some HStoreField attributes and I can't seem to use Django's ORM HStoreField to query those values using LIKE.
When doing Model.objects.filter(hstoreattr__values__contains=['text']), the queryset only contains rows in which hstoreattr has any value that matches text exactly.
What I'm looking for is a way to search by, say, te instead of text and those same rows be returned as well. I'm aware this is possible in a raw PostgreSQL query but I'm looking for a solution that uses Django ORM.

If you want to check whether the value of a particular key contains 'te' in every object, you can do:
Model.objects.filter(hstoreattr__your_key__icontains='te')
If you want to check whether any value in your hstore field contains 'te', you will need to write your own lookup, because Django does not provide one by default. Refer to the custom lookups section of the Django docs for more info.

As far as I remember, you cannot filter across all values. To filter on a value, you have to name the key it belongs to. For a case-insensitive match, use __icontains.
Although you cannot filter by all values, you can filter by all keys, just as you showed in your code.
If you want to search for 'text' in all objects under a key named, say, 'fo', do something like this:
Model.objects.filter(hstoreattr__icontains={'fo': 'text'})

Django custom field - automatically add COLLATE to query

I'm trying to create a custom field which would automatically add COLLATE information into the WHERE part of SQL query:
class IgnoreDiacriticsField(models.TextField):
    def get_prep_lookup(self, lookup_type, value):
        if lookup_type == 'exact':
            return ' "' + self.get_prep_value(value) + '" COLLATE utf8_general_ci'
When I perform a query like this:
result = ModelClass.objects.filter(field='value')
nothing is found, even though the query (print result.query) is valid and matches several rows. Am I doing something wrong?
The reason I'm adding the collation information is that I want to perform queries on those fields and ignore any diacritics.

Are you using MySQLdb 1.2.1p2 by any chance? From the Django documentation:
If you're using MySQLdb 1.2.1p2, Django's standard CharField class
will return unicode strings even with utf8_bin collation. However,
TextField fields will be returned as an array.array instance (from
Python's standard array module). There isn't a lot Django can do about
that, since, again, the information needed to make the necessary
conversions isn't available when the data is read in from the
database. This problem was fixed in MySQLdb 1.2.2, so if you want to
use TextField with utf8_bin collation, upgrading to version 1.2.2 and
then dealing with the bytestrings (which shouldn't be too difficult)
as described above is the recommended solution.
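Separately from the driver issue quoted above, one plausible explanation for the empty result is that the string returned by get_prep_lookup is sent to the database as a bound parameter, so the quotes and the COLLATE keyword are compared as data instead of being parsed as SQL. The sketch below illustrates that general behaviour with Python's sqlite3 module; it is an assumption that this is what happens in the asker's setup, the item table is hypothetical, and SQLite's NOCASE collation stands in for utf8_general_ci.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (field TEXT)")
conn.execute("INSERT INTO item (field) VALUES (?)", ("Value",))

# COLLATE placed inside the bound value: the whole string is treated as data,
# so the column is compared against the literal text '"value" COLLATE NOCASE'.
as_param = conn.execute(
    "SELECT field FROM item WHERE field = ?",
    ('"value" COLLATE NOCASE',),
).fetchall()

# COLLATE written as part of the SQL text, with only the value bound.
as_sql = conn.execute(
    "SELECT field FROM item WHERE field = ? COLLATE NOCASE",
    ("value",),
).fetchall()
```

Only the second query finds the row. If the same thing happens here, the collation would have to be emitted as part of the SQL fragment rather than as part of the value.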
