Django query case-insensitive list match - django

I have a list of names that I want to match case insensitive, is there a way to do it without using a loop like below?
a = ['name1', 'name2', 'name3']
result = any([Name.objects.filter(name__iexact=name) for name in a])

Unfortunately, there is no __iin field lookup. But there is an iregex lookup that might be useful, like so:
result = Name.objects.filter(name__iregex=r'(name1|name2|name3)')
or even:
a = ['name1', 'name2', 'name3']
result = Name.objects.filter(name__iregex=r'(' + '|'.join(a) + ')')
Note that if a can contain characters that are special in a regex, you need to escape them properly.
NEWS: In Django 1.7+ it is possible to create your own lookups, so you can actually use filter(name__iin=['name1', 'name2', 'name3']) after proper initialization. See documentation reference for details.
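A minimal sketch of building such a pattern, assuming whole-value matches are wanted: each name is escaped, and the alternation is anchored with ^ and $, since an unanchored alternation also matches longer values that merely contain one of the names. build_iexact_pattern is a hypothetical helper, and Python's re module only stands in for the database regex engine, whose syntax may differ.

```python
import re


def build_iexact_pattern(names):
    """Return a regex string matching any of `names` as a whole value."""
    # Escape each name so characters like '.' or '+' are matched literally.
    escaped = (re.escape(n) for n in names)
    # Anchor the alternation so only complete values match.
    return r'^(' + '|'.join(escaped) + r')$'


pattern = build_iexact_pattern(['name1', 'a.b', 'c+d'])
# re.IGNORECASE plays the role of the case-insensitive iregex lookup here.
matcher = re.compile(pattern, re.IGNORECASE)
```

The resulting string could then be passed on as Name.objects.filter(name__iregex=build_iexact_pattern(a)).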

In PostgreSQL you could try creating a case-insensitive index as described here:
https://stackoverflow.com/a/4124225/110274
Then run a query:
from django.db.models import Q
name_filter = Q()
for name in names:
    name_filter |= Q(name__iexact=name)
result = Name.objects.filter(name_filter)
Index search will run faster than the regex matching query.

Another way to do this, using Django query functions and annotation:
from django.db.models.functions import Lower
Record.objects.annotate(name_lower=Lower('name')).filter(name_lower__in=['two', 'one'])

Adding onto what Rasmuj said, escape any user-input like so
import re
result = Name.objects.filter(name__iregex=r'(' + '|'.join([re.escape(n) for n in a]) + ')')

Keep in mind that at least in MySQL you have to set utf8_bin collation in your tables to actually make them case sensitive. Otherwise they are case preserving but case insensitive. E.g.
>>> models.Person.objects.filter(first__in=['John', 'Ringo'])
[<Person: John Lennon>, <Person: Ringo Starr>]
>>> models.Person.objects.filter(first__in=['joHn', 'RiNgO'])
[<Person: John Lennon>, <Person: Ringo Starr>]
So, if portability is not crucial and you use MySQL you may choose to ignore the issue altogether.

I am expanding on Exgeny's idea to turn it into a two-liner.
import functools
Name.objects.filter(functools.reduce(lambda acc, x: acc | Q(name__iexact=x), names, Q()))

After trying many methods, including annotate, which resulted in duplicate objects, I discovered transformers (https://docs.djangoproject.com/en/4.1/howto/custom-lookups/#a-transformer-example) which allow for a simple solution.
Add the following to models.py before model declarations:
from django.db import models
class LowerCase(models.Transform):
    lookup_name = "lower"
    function = "LOWER"
models.CharField.register_lookup(LowerCase)
models.TextField.register_lookup(LowerCase)
You can now use the __lower transformer alongside any lookup, in this case: field__lower__in. You can also add bilateral = True to the transformer class for it to apply to both the field and the list items, which should be functionally equivalent to __iin.

Here is an example of a classmethod on a custom User model that filters users by email, case-insensitively:
from django.db.models import Q
@classmethod
def get_users_by_email_query(cls, emails):
    q = Q()
    for email in [email.strip() for email in emails]:
        q = q | Q(email__iexact=email)
    return cls.objects.filter(q)

If this is a common use case for anyone, you can implement this by adapting the code from Django's In and IExact lookups.
Make sure the following code is imported before all model declarations:
from django.db.models import Field
from django.db.models.lookups import In
@Field.register_lookup
class IIn(In):
    lookup_name = 'iin'
    def process_lhs(self, *args, **kwargs):
        sql, params = super().process_lhs(*args, **kwargs)
        # Convert LHS to lowercase
        sql = f'LOWER({sql})'
        return sql, params
    def process_rhs(self, qn, connection):
        rhs, params = super().process_rhs(qn, connection)
        # Convert RHS to lowercase
        params = tuple(p.lower() for p in params)
        return rhs, params
Example usage:
result = Name.objects.filter(name__iin=['name1', 'name2', 'name3'])

Related

Django postgres HStoreField order by

Can I order the results of a QuerySet by values inside an HStoreField? For example, I have this model:
class Product(models.Model):
    name = CharField(max_length=100)
    properties = HStoreField()
I want to store some properties of my product in the HStoreField, like:
{ 'discount': '10', 'color': 'white'}
In the view I want to order the resulting QuerySet by discount.
The above answer does not work. Order transforms were never implemented for HStoreField, see https://code.djangoproject.com/ticket/24747.
But the suggestion in https://code.djangoproject.com/ticket/24592 works. Here is some more detail.
from django.contrib.postgres.fields import HStoreField
from django.db.models import F, Func, Model, TextField, Value
class MyThing(Model):
    name = TextField()
    # default=dict so that new instances start with an empty mapping
    keys = HStoreField(default=dict)
things = [MyThing(name='foo'),
          MyThing(name='bar'),
          MyThing(name='baz')]
things[0].keys['movie'] = 'Jaws'
things[1].keys['movie'] = 'Psycho'
things[2].keys['movie'] = 'The Birds'
things[0].keys['rating'] = 5
things[1].keys['rating'] = 4
things[2].keys['year'] = '1963'
# Informal search
MyThing.objects\
    .filter(keys__has_key='rating')\
    .order_by(Func(F('keys'), Value('movie'),
                   function='',
                   arg_joiner=' -> ',
                   output_field=TextField()))
The formal search is exactly as described in the second link above. Use the imports in the above snippet with that code.
This might work (I cannot test right now):
.order_by("properties -> 'discount'")
But be aware that HSTORE values are strings, so you do not get numeric order but instead items are ordered as strings.
To get proper numeric order, you should extract the properties->discount key as a separate column and cast it to integer.
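A small plain-Python illustration of the difference, using made-up discount values. It only mimics the effect: a value left as text is compared character by character, while a value cast to integer is compared by magnitude.

```python
# Hypothetical discount values as they would come out of an HSTORE column.
discounts = ['10', '9', '100']

# Text comparison: '10' < '100' < '9', because '1' sorts before '9'.
as_strings = sorted(discounts)

# Numeric comparison after casting each value to int.
as_numbers = sorted(discounts, key=int)
```

Here as_strings puts '9' last, while as_numbers gives the order a shopper would expect.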

Django queryset: Exclude list of emails using endswith

I'm running metrics on user data and want to exclude users that have bogus emails like '@example.com' or '@test.com'.
I tried
emails_to_exclude = ['@example.com', '@test.com', '@mailinator.com' ....]
Users.objects.exclude(email__endswith__in=emails_to_exclude)
Unfortunately this doesn't work. Looks like endswith and in don't play nice with each other. Any ideas?
Simply loop over the list and chain exclude() calls; QuerySets are lazy, so no query runs until the result is evaluated.
emails_to_exclude = ['@example.com', '@test.com', '@mailinator.com' ....]
users = Users.objects
for exclude_email in emails_to_exclude:
    users = users.exclude(email__endswith=exclude_email)
users = users.all()
You can also do this with regular expressions in a single query.
emails_to_exclude = ['@example.com', '@test.com', '@mailinator.com' ....]
User.objects.exclude(email__regex="|".join(emails_to_exclude))
I don't know the efficiency of this query.
SQLite has no built-in regular expression support, so Django supplies a Python-based REGEXP function for it; on SQLite the pattern therefore follows the syntax of Python's re module.
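Note that the pattern joined above is used verbatim: each dot matches any character, and nothing ties the match to the end of the address. A minimal sketch of a stricter pattern follows, assuming Python-compatible regex syntax; build_suffix_pattern is a hypothetical helper, and Python's re module stands in for the database engine.

```python
import re


def build_suffix_pattern(suffixes):
    """Return a regex string matching values that end with any suffix."""
    # Escape each suffix so '.' is literal, then anchor at the end with '$'.
    return '(' + '|'.join(re.escape(s) for s in suffixes) + ')$'


suffix_pattern = build_suffix_pattern(['@example.com', '@test.com'])
```

The string could then be used as User.objects.exclude(email__regex=suffix_pattern).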
You can probably loop over the emails and build up a Q Object. Actually, you can probably do a 1-liner if you're clever.
User.objects.exclude(bitwise_or_function[Q(email__endswith=e) for e in emails_to_exclude])
Something like that. I don't remember the function to bitwise-OR a whole list together, my Python's rusty.
This should work with the latest version of Python and Django. The reduce function is a good friend.
from functools import reduce
from operator import or_
from django.db.models import Q
emails_to_exclude = ['@example.com', '@test.com', '@mailinator.com' ....]
users = ( Users.objects
.exclude( reduce( or_, (
Q(( "email__endswith", k ))
for k in emails_to_exclude
) ) )
)
One more way to achieve this:
from django.contrib.auth.models import User
from django.db.models import Q
emails_to_exclude = ['@example.com', '@test.com', '@mailinator.com']
users = User.objects.all()
filters = Q()
for ending in emails_to_exclude:
    filters |= Q(email__endswith=ending)
filtered_users = users.filter(~filters)
I changed the exclusion input to make it a set and to not have the "@". Otherwise, this should do what you want.
>>> emails = ['foo@example.com', 'spam@stackoverflow.com', 'bad@test.com']
>>> excludes = {'example.com', 'test.com', 'mailinator.com'}
>>> [email for email in emails if email.split('@')[-1] not in excludes]
['spam@stackoverflow.com']

django icontains with __in lookup

So I want to find any kind of matching given some fields, so for example, this is what I would like to do:
possible_merchants = ["amazon", "web", "services"]
# Possible name --> "Amazon Service"
Companies.objects.filter(name__icontains__in=possible_merchants)
Sadly, it is not possible to mix icontains and the __in lookup.
It seems to be a pretty complex query, so if I could at least ignore the case of the name, that would be enough. For example:
Companies.objects.filter(name__ignorecase__in=possible_merchants)
Any ideas?
P.S.: The queries I posted don't work; they are just a way to express what I need (just in case, heh).
You can create Q objects and combine them with the | operator to OR their conditions together:
from django.db.models import Q
def companies_matching(merchants):
    """
    Return a queryset for companies whose names contain case-insensitive
    matches for any of the `merchants`.
    """
    q = Q()
    for merchant in merchants:
        q |= Q(name__icontains=merchant)
    return Companies.objects.filter(q)
(And similarly with iexact instead of icontains.)
I find using reduce and the or_ operator a cleaner approach:
from django.db.models import Q
from functools import reduce
from operator import or_
def get_companies_from_merchants(merchant_list):
    q_object = reduce(or_, (Q(name__icontains=merchant) for merchant in merchant_list))
    return Companies.objects.filter(q_object)
This builds one Q object per element of merchant_list, each checking whether the name contains that element. reduce then folds all of these Q objects into a single Q object with multiple ORs, which can be passed directly to filter().
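To make the folding step concrete, here is a toy sketch that runs without Django. Cond is a hypothetical stand-in for Q that only records how conditions are OR-ed together; it is not part of Django, and the text it produces is not real SQL.

```python
from functools import reduce
from operator import or_


class Cond:
    """Toy stand-in for django.db.models.Q; it only records the OR structure."""

    def __init__(self, text):
        self.text = text

    def __or__(self, other):
        # Combining two conditions yields a new condition, as Q objects do.
        return Cond(f"({self.text} OR {other.text})")


merchant_list = ["amazon", "web", "services"]
# reduce applies or_ pairwise from left to right over the conditions.
combined = reduce(or_, (Cond(f"name ICONTAINS '{m}'") for m in merchant_list))
```

One caveat that follows from how functools.reduce works: without an initial value it raises TypeError on an empty sequence, so an empty merchant list needs separate handling.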
This is the approach that I adopted:
class MyManager(models.Manager):
    def exclusive_in(self, lookup, value_list):
        return self.filter(reduce(or_, (Q(**{lookup: _}) for _ in value_list)))
Here is how to use it:
Companies.objects.exclusive_in('name__icontains', possible_merchants)
It was inspired by other answers in this thread, as well as Django filter queryset __in for *every* item in list.
Another approach would be to simulate what Django normally does for iexact queries (it converts both sides of the comparison to upper case via the SQL UPPER function).
This way, the query will look like this:
from django.db.models.functions import Upper
Companies.objects.annotate(
    upper_name=Upper("name")
).filter(
    upper_name__in=[merchant.upper() for merchant in possible_merchants]
)

django queryset - searching for firstname and lastname

I have a Django app that retrieves all subjects from a single table of users. I've also implemented a search input form;
this is the query performed:
all_soggs = Entity.objects.filter(lastname__istartswith=request.GET['query_term']).order_by('lastname')
if(all_soggs.count()==0):
    all_soggs = Entity.objects.filter(firstname__istartswith=request.GET['query_term']).order_by('firstname')
As you can see, the query first searches for matching items by lastname, and then by firstname. This works until I enter the complete name, 'firstname lastname' or 'lastname firstname'; in that case there are no results. How can I modify the query to make a better search?
thanks - luke
Copy/paste from: https://stackoverflow.com/a/17361729/1297812
from django.db.models import Q
def find_user_by_name(query_name):
    qs = User.objects.all()
    for term in query_name.split():
        qs = qs.filter(Q(first_name__icontains=term) | Q(last_name__icontains=term))
    return qs
You need Q objects and you also need to split your query into separate terms (since no first name will match the full string "Firstname Lastname").
Here's an idea to match any first or last name starting with either "Firstname" or "Lastname" in the search "Firstname Lastname".
This is a generic search - adjust the query to suit your specific needs!
Edit: oops, I really don't like using reduce since it looks confusing, but these need to be ORed together and we can't do a more verbose version because the number of terms is unknown.
import operator
from functools import reduce
from django.db.models import Q
search_args = []
for term in request.GET['query_term'].split():
    for query in ('first_name__istartswith', 'last_name__istartswith'):
        search_args.append(Q(**{query: term}))
all_soggs = Entity.objects.filter(reduce(operator.or_, search_args))
To clarify how to use Q objects, given the search "Firstname Lastname" the previous query is equal to:
Entity.objects.filter(
Q(first_name__istartswith="Firstname") | Q(last_name__istartswith="Firstname") |
Q(first_name__istartswith="Lastname") | Q(last_name__istartswith="Lastname")
)
This is a fairly old question but I just ran into the same problem and I thought I would share a more elegant solution.
from django.db.models import QuerySet, Value as V
from django.db.models.functions import Concat
from ..models import User
def find_user_by_name(search_str) -> QuerySet[User]:
    q = User.objects.annotate(full_name=Concat('first_name', V(' '), 'last_name'))
    q = q.filter(full_name__icontains=search_str)
    return q
This is the only solution I have found that gives the behaviour I wanted, i.e. searching with a full name string containing a space ("John Doe") as well as with a partial string ("John Do").
Similar question: Querying full name in Django
query = request.GET.get('query')
entities = []
try:
    firstname = query.split(' ')[0]
    lastname = query.split(' ')[1]
    entities += Entity.objects.filter(firstname__icontains=firstname,lastname__icontains=lastname)
    entities += Entity.objects.filter(firstname__icontains=lastname,lastname__icontains=firstname)
except IndexError:
    # A single-word query has no second term, so no full-name match is attempted.
    pass
entities = set(entities)

How to filter empty or NULL names in a QuerySet?

I have first_name, last_name & alias (optional) which I need to search for. So, I need a query to give me all the names that have an alias set.
Only if I could do:
Name.objects.filter(alias!="")
So, what is the equivalent to the above?
You could do this:
Name.objects.exclude(alias__isnull=True)
If you need to exclude null values and empty strings, the preferred way to do so is to chain together the conditions like so:
Name.objects.exclude(alias__isnull=True).exclude(alias__exact='')
Chaining these methods together basically checks each condition independently: in the above example, we exclude rows where alias is either null or an empty string, so you get all Name objects that have a not-null, not-empty alias field. The generated SQL would look something like:
SELECT * FROM Name WHERE alias IS NOT NULL AND alias != ""
You can also pass multiple arguments to a single call to exclude, which would ensure that only objects that meet every condition get excluded:
Name.objects.exclude(some_field=True, other_field=True)
Here, only rows in which both some_field and other_field are true get excluded, so we get all rows where at least one of the two fields is not true. The generated SQL code would look a little like this:
SELECT * FROM Name WHERE NOT (some_field = TRUE AND other_field = TRUE)
Alternatively, if your logic is more complex than that, you could use Django's Q objects:
from django.db.models import Q
Name.objects.exclude(Q(alias__isnull=True) | Q(alias__exact=''))
For more info see this page and this page in the Django docs.
As an aside: My SQL examples are just an analogy--the actual generated SQL code will probably look different. You'll get a deeper understanding of how Django queries work by actually looking at the SQL they generate.
Name.objects.filter(alias__gt='',alias__isnull=False)
Firstly, the Django docs strongly recommend not using NULL values for string-based fields such as CharField or TextField. Read the documentation for the explanation:
https://docs.djangoproject.com/en/dev/ref/models/fields/#null
Solution:
You can also chain together methods on QuerySets, I think. Try this:
Name.objects.exclude(alias__isnull=True).exclude(alias="")
That should give you the set you're looking for.
1. When using exclude, keep the following in mind to avoid common mistakes:
Do not pass multiple conditions to a single exclude() call the way you would with filter(). To exclude rows matching any one of several conditions, chain multiple exclude() calls.
Example: (NOT a AND NOT b)
Entry.objects.exclude(title='').exclude(headline='')
is equivalent to
SELECT ... WHERE NOT title = '' AND NOT headline = ''
======================================================
2. Only pass multiple conditions to one exclude() when you really mean it:
Example: NOT (a AND b)
Entry.objects.exclude(title='', headline='')
is equivalent to
SELECT ... WHERE NOT (title = '' AND headline = '')
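The difference between the two forms can be checked with a small plain-Python sketch over made-up rows. It ignores NULL handling and multi-valued relations, where SQL and Django behave in more subtle ways.

```python
# Hypothetical rows covering every combination of empty and non-empty fields.
rows = [
    {"title": "", "headline": ""},
    {"title": "", "headline": "News"},
    {"title": "Post", "headline": ""},
    {"title": "Post", "headline": "News"},
]

# exclude(title='').exclude(headline='')  ->  NOT a AND NOT b
chained = [r for r in rows if not r["title"] == "" and not r["headline"] == ""]

# exclude(title='', headline='')  ->  NOT (a AND b)
single_call = [r for r in rows if not (r["title"] == "" and r["headline"] == "")]
```

Chaining keeps only the row where both fields are filled in, while the single call drops only the row where both fields are empty.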
If you want to exclude null (None), empty string (""), as well as a string containing white spaces (" "), you can use the __regex along with __isnull filter option
Name.objects.filter(
alias__isnull = False,
alias__regex = r"\S+"
)
alias__isnull=False excludes all rows where the column is NULL.
alias__regex = r"\S+" makes sure that the column value contains at least one non-whitespace character.
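As a rough illustration of what this filter keeps, here is a plain-Python sketch over made-up alias values. Python's re module stands in for the database regex engine; whether a given backend supports \S is an assumption to check.

```python
import re

# Matches any value containing at least one non-whitespace character.
has_visible_text = re.compile(r"\S+")

# Hypothetical alias values: NULL, empty, whitespace-only, and two real ones.
aliases = [None, "", "   ", "bob", " al "]

# Mirrors alias__isnull=False combined with alias__regex=r"\S+".
kept = [a for a in aliases if a is not None and has_visible_text.search(a)]
```

Only the last two values survive; None, the empty string, and the whitespace-only string are all dropped.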
From Django 1.8,
from django.db.models.functions import Length
Name.objects.annotate(alias_length=Length('alias')).filter(alias_length__gt=0)
You can simply do this:
Name.objects.exclude(alias="").exclude(alias=None)
It's really just that simple. filter is used to match and exclude is to match everything but what it specifies. This would evaluate into SQL as NOT alias='' AND alias IS NOT NULL.
Another approach is a generic isempty lookup that can be used with any field.
It can also be used by Django REST framework or other apps that use Django lookups:
# Note: distutils was removed in Python 3.12, so strtobool must come from elsewhere there.
from distutils.util import strtobool
from django.db.models import Field
from django.db.models.lookups import BuiltinLookup
@Field.register_lookup
class IsEmpty(BuiltinLookup):
    lookup_name = 'isempty'
    prepare_rhs = False
    def as_sql(self, compiler, connection):
        sql, params = compiler.compile(self.lhs)
        condition = self.rhs if isinstance(self.rhs, bool) else bool(strtobool(self.rhs))
        if condition:
            # The LHS appears twice in the SQL, so its params are repeated too.
            return "(%s IS NULL OR %s = '')" % (sql, sql), list(params) * 2
        else:
            return "%s <> ''" % sql, params
You can then use it like this:
Name.objects.filter(alias__isempty=False)
This is another simple way to do it, although it only excludes NULL values, not empty strings:
Name.objects.exclude(alias=None)