Django startswith on fields - django

let's say that I have an Address model with a postcode field. I can lookup addresses with postcode starting with "123" with this line:
Address.objects.filter(postcode__startswith="123")
Now, I need to do this search the "other way around". I have an Address model with a postcode_prefix field, and I need to retrieve all the addresses for which postcode_prefix is a prefix of a given code, like "12345". So if in my db I had 2 addresses with postcode_prefix = "123" and "234", only the first one would be returned.
Something like:
Address.objects.filter("12345".startswith(postcode_prefix))
The problem is that this doesn't work.
The only solution I can come up with is to perform a filter on the first char, like:
Address.objects.filter(postcode_prefix__startswith="12345"[0])
and then, when I get the results, make a list comprehension that filters them properly, like this:
results = [r for r in results if "12345".startswith(r.postcode_prefix)]
Is there a better way to do it in django?

Edit: This does not answer the original question but how to word a query the other way around.
I think what you are trying to do with your "something like" line is properly written as this:
Address.objects.filter(postcode__startswith=postcode_prefix)

In SQL terms, what you want to achieve reads like ('12345' is the postcode you are searching for):
SELECT *
FROM address
WHERE '12345' LIKE postcode_prefix||'%'
This is not really a standard query and I do not see any possibility to achieve this in Django using only get()/filter().
However, Django offers a way to provide additional SQL clauses with extra():
postcode = '12345'
Address.objects.extra(where=["%s LIKE postcode_prefix||'%%'"], params=[postcode])
Please see the Django documentation on extra() for further reference. Also note that the extra contains pure SQL, so you need to make sure that the clause is valid for your database.
Hope this works for you.

Bit of a mouthful but you can do this by annotating your search value and then filtering against it. All happens pretty quickly in-database.
from django.db.models import Value as V, F, CharField
Address.objects.exclude(
postcode_prefix=''
).annotate(
postcode=Value('12345', output_field=CharField())
).filter(
postcode__startswith=F('postcode_prefix')
)
The exclude is only necessary if postcode_prefix can be empty. This would result in an SQL like '%', which would match every postcode.
I'm sure you could do this via a nice templated function these days too... But this is clean enough for me.

A possible alternative. (Have no idea how it compares to the accepted solution with a column as the second param to like, in execution time)
q=reduce(lambda a,b:a|b, [Q(postcode__startswith=postcode[:i+1]) for i in range(len(postcode))])
Thus, you generate all prefixes, and or them together...

The raw SQL query that would do that you need looks something like this:
select * from postal_code_table where '1234567' like postal_code||'%'
This query will select any postal_code from your table that is a substring of '1234567' and also must start from begining, ie: '123', '1234', etc.
Now to implement this in Django, the preferred method is using a custom look up:
from django.db.models.fields import Field
from django.db.models import Lookup
#Field.register_lookup
class LowerStartswithContainedBy(Lookup):
'''Postgres LIKE query statement'''
lookup_name = 'istartswithcontainedby'
def as_sql(self, compiler, connection):
lhs, lhs_params = self.process_lhs(compiler, connection)
rhs, rhs_params = self.process_rhs(compiler, connection)
params = lhs_params + rhs_params
return f"LOWER({rhs}) LIKE LOWER({lhs}) || '%%'", params
Now you can write a django query such as the following:
PostCode.objects.filter(code__istartswithcontainedby='1234567')
Similarly, if you are just looking for substring and do not require the startswith condition, simply modify the return line of as_sql method to the following:
return f"LOWER({rhs}) LIKE '%%' || LOWER({lhs}) || '%%'", params
For more detailed explanation, see my git gist Django custom lookup

A. If not the issue https://code.djangoproject.com/ticket/13363,
you could do this:
queryset.extra(select={'myconst': "'this superstring is myconst value'"}).filter(myconst__contains=F('myfield'))
Maybe, they will fix an issue and it can work.
B. If not the issue 16731 (sorry not providing full url, not enough rep, see another ticket above) you could filter by fields that added with '.annotate', with creation of custom aggreation function, like here:
http://coder.cl/2011/09/custom-aggregates-on-django/
C. Last and successful. I have managed to do this using monkeypatching of the following:
django.db.models.sql.Query.query_terms
django.db.models.fields.Field.get_prep_lookup
django.db.models.fields.Field.get_db_prep_lookup
django.db.models.sql.where.WhereNode.make_atom
Just defined custom lookup '_starts', which has reverse logic of '_startswith'

Related

How to create custom db model function in Django like `Greatest`

I have a scenario that, i want a greatest value with the field name. I can get greatest value using Greatest db function which django provides. but i am not able to get its field name. for example:
emps = Employee.objects.annotate(my_max_value=Greatest('date_time_field_1', 'date_time_field_1'))
for e in emps:
print(e.my_max_value)
here i will get the value using e.my_max_value but i am unable to find out the field name of that value
You have to annotate a Conditional Expression using Case() and When().
from django.db.models import F, Case, When
emps = Employee.objects.annotate(
greatest_field=Case(
When(datetime_field_1__gt=F("datetime_field_2"),
then="datetime_field_1"),
When(datetime_field_2__gt=F("datetime_field_1"),
then="datetime_field_2"),
default="equal",
)
)
for e in emps:
print(e.greatest_field)
If you want the database query to tell you which of the fields was larger, you'll need to add another annotated column, using case/when logic to return one field name or the other. (See https://docs.djangoproject.com/en/4.0/ref/models/conditional-expressions/#when)
Unless you're really trying to offload work onto the database, it'll be much simpler to do the comparison work in Python.

How to get boolean result in annotate django?

I have a filter which should return a queryset with 2 objects, and should have one different field. for example:
obj_1 = (name='John', age='23', is_fielder=True)
obj_2 = (name='John', age='23', is_fielder=False)
Both the objects are of same model, but different primary key. I tried usign the below filter:
qs = Model.objects.filter(name='John', age='23').annotate(is_fielder=F('plays__outdoor_game_role')=='Fielder')
I used annotate first time, but it gave me the below error:
TypeError: QuerySet.annotate() received non-expression(s): False.
I am new to Django, so what am I doing wrong, and what should be the annotate to get the required objects as shown above?
The solution by #ktowen works well, quite straightforward.
Here is another solution I am using, hope it is helpful too.
queryset = queryset.annotate(is_fielder=ExpressionWrapper(
Q(plays__outdoor_game_role='Fielder'),
output_field=BooleanField(),
),)
Here are some explanations for those who are not familiar with Django ORM:
Annotate make a new column/field on the fly, in this case, is_fielder. This means you do not have a field named is_fielder in your model while you can use it like plays.outdor_game_role.is_fielder after you add this 'annotation'. Annotate is extremely useful and flexible, can be combined with almost every other expression, should be a MUST-KNOWN method in Django ORM.
ExpressionWrapper basically gives you space to wrap a more complecated combination of conditions, use in a format like ExpressionWrapper(expression, output_field). It is useful when you are combining different types of fields or want to specify an output type since Django cannot tell automatically.
Q object is a frequently used expression to specify a condition, I think the most powerful part is that it is possible to chain the conditions:
AND (&): filter(Q(condition1) & Q(condition2))
OR (|): filter(Q(condition1) | Q(condition2))
Negative(~): filter(~Q(condition))
It is possible to use Q with normal conditions like below:
(Q(condition1)|id__in=[list])
The point is Q object must come to the first or it will not work.
Case When(then) can be simply explained as if con1 elif con2 elif con3 .... It is quite powerful and personally, I love to use this to customize an ordering object for a queryset.
For example, you need to return a queryset of watch history items, and those must be in an order of watching by the user. You can do it with for loop to keep the order but this will generate plenty of similar queries. A more elegant way with Case When would be:
item_ids = [list]
ordering = Case(*[When(pk=pk, then=pos)
for pos, pk in enumerate(item_ids)])
watch_history = Item.objects.filter(id__in=item_ids)\
.order_by(ordering)
As you can see, by using Case When(then) it is possible to bind those very concrete relations, which could be considered as 1) a pinpoint/precise condition expression and 2) especially useful in a sequential multiple conditions case.
You can use Case/When with annotate
from django.db.models import Case, BooleanField, Value, When
Model.objects.filter(name='John', age='23').annotate(
is_fielder=Case(
When(plays__outdoor_game_role='Fielder', then=Value(True)),
default=Value(False),
output_field=BooleanField(),
),
)

Django Array contains a field

I am using Django, with mongoengine. I have a model Classes with an inscriptions list, And I want to get the docs that have an id in that list.
classes = Classes.objects.filter(inscriptions__contains=request.data['inscription'])
Here's a general explanation of querying ArrayField membership:
Per the Django ArrayField docs, the __contains operator checks if a provided array is a subset of the values in the ArrayField.
So, to filter on whether an ArrayField contains the value "foo", you pass in a length 1 array containing the value you're looking for, like this:
# matches rows where myarrayfield is something like ['foo','bar']
Customer.objects.filter(myarrayfield__contains=['foo'])
The Django ORM produces the #> postgres operator, as you can see by printing the query:
print Customer.objects.filter(myarrayfield__contains=['foo']).only('pk').query
>>> SELECT "website_customer"."id" FROM "website_customer" WHERE "website_customer"."myarrayfield_" #> ['foo']::varchar(100)[]
If you provide something other than an array, you'll get a cryptic error like DataError: malformed array literal: "foo" DETAIL: Array value must start with "{" or dimension information.
Perhaps I'm missing something...but it seems that you should be using .filter():
classes = Classes.objects.filter(inscriptions__contains=request.data['inscription'])
This answer is in reference to your comment for rnevius answer
In Django ORM whenever you make a Database call using ORM, it will generally return either a QuerySet or an object of the model if using get() / number if you are using count() ect., depending on the functions that you are using which return other than a queryset.
The result from a Queryset function can be used to implement further more refinement, like if you like to perform a order() or collecting only distinct() etc. Queryset are lazy which means it only hits the database when they are actually used not when they are assigned. You can find more information about them here.
Where as the functions that doesn't return queryset cannot implement such things.
Take time and go through the Queryset Documentation more in depth explanation with examples are provided. It is useful to understand the behavior to make your application more efficient.

django's .extra(where= clauses are clobbered by table-renaming .filter(foo__in=... subselects

The short of it is, the table names of all queries that are inside a filter get renamed to u0, u1, ..., so my extra where clauses won't know what table to point to. I would love to not have to hand-make all the queries for every way I might subselect on this data, and my current workaround is to turn my extra'd queries into pk values_lists, but those are really slow and something of an abomination.
Here's what this all looks like. You can mostly ignore the details of what goes in the extra of this manager method, except the first sql line which points to products_product.id:
def by_status(self, *statii):
return self.extra(where=["""products_product.id IN
(SELECT recent.product_id
FROM (
SELECT product_id, MAX(start_date) AS latest
FROM products_productstatus
GROUP BY product_id
) AS recent
JOIN products_productstatus AS ps ON ps.product_id = recent.product_id
WHERE ps.start_date = recent.latest
AND ps.status IN (%s))""" % (', '.join([str(stat) for stat in statii]),)])
Which works wonderfully for all the situations involving only the products_product table.
When I want these products as a subselect, i do:
Piece.objects.filter(
product__in=Product.objects.filter(
pk__in=list(
Product.objects.by_status(FEATURED).values_list('id', flat=True))))
How can I keep the generalized abilities of a query set, yet still use an extra where clause?
At first: the issue is not totally clear to me. Is the second code block in your question the actual code you want to execute? If this is the case the query should work as expected since there is no subselect performed.
I assume so that you want to use the second code block without the list() around the subselect to prevent a second query being performed.
The django documentation refers to this issue in the documentation about the extra method. However its not very easy to overcome this issue.
The easiest but most "hakish" solution is to observe which table alias is produced by django for the table you want to query in the extra method. You can rely on the persistent naming of this alias as long as you construct the query always in the same fashion (you don't change the order of multiple extra methods or filter calls that cause a join).
You can inspect a query that will be execute in the DB queryset by using:
print Model.objects.filter(...).query
This will reveal the aliases that are used for the tables you want to query.
As of Django 1.11, you should be able to use Subquery and OuterRef to generate an equivalent query to your extra (using a correlated subquery rather than a join):
def by_status(self, *statii):
return self.filter(
id__in=Subquery(ProductStatus.values("product_id").filter(
status__in=statii,
product__in=Subquery(ProductStatus.objects.values(
"product_id",
).annotate(
latest=Max("start_date"),
).filter(
latest=OuterRef("start_date"),
).values("product_id"),
),
)
You could probably do it with Window expressions as well (as of Django 2.0).
Note that this is untested, so may need some tweaks.

How to filter empty or NULL names in a QuerySet?

I have first_name, last_name & alias (optional) which I need to search for. So, I need a query to give me all the names that have an alias set.
Only if I could do:
Name.objects.filter(alias!="")
So, what is the equivalent to the above?
You could do this:
Name.objects.exclude(alias__isnull=True)
If you need to exclude null values and empty strings, the preferred way to do so is to chain together the conditions like so:
Name.objects.exclude(alias__isnull=True).exclude(alias__exact='')
Chaining these methods together basically checks each condition independently: in the above example, we exclude rows where alias is either null or an empty string, so you get all Name objects that have a not-null, not-empty alias field. The generated SQL would look something like:
SELECT * FROM Name WHERE alias IS NOT NULL AND alias != ""
You can also pass multiple arguments to a single call to exclude, which would ensure that only objects that meet every condition get excluded:
Name.objects.exclude(some_field=True, other_field=True)
Here, rows in which some_field and other_field are true get excluded, so we get all rows where both fields are not true. The generated SQL code would look a little like this:
SELECT * FROM Name WHERE NOT (some_field = TRUE AND other_field = TRUE)
Alternatively, if your logic is more complex than that, you could use Django's Q objects:
from django.db.models import Q
Name.objects.exclude(Q(alias__isnull=True) | Q(alias__exact=''))
For more info see this page and this page in the Django docs.
As an aside: My SQL examples are just an analogy--the actual generated SQL code will probably look different. You'll get a deeper understanding of how Django queries work by actually looking at the SQL they generate.
Name.objects.filter(alias__gt='',alias__isnull=False)
Firstly, the Django docs strongly recommend not using NULL values for string-based fields such as CharField or TextField. Read the documentation for the explanation:
https://docs.djangoproject.com/en/dev/ref/models/fields/#null
Solution:
You can also chain together methods on QuerySets, I think. Try this:
Name.objects.exclude(alias__isnull=True).exclude(alias="")
That should give you the set you're looking for.
1. When using exclude, keep the following in mind to avoid common mistakes:
Should not add multiple conditions into an exclude() block like filter(). To exclude multiple conditions, you should use multiple exclude().
Example: (NOT a AND NOT b)
Entry.objects.exclude(title='').exclude(headline='')
equal to
SELECT... WHERE NOT title = '' AND NOT headline = ''
======================================================
2. Only use multiple when you really know about it:
Example: NOT (a AND b)
Entry.objects.exclude(title='', headline='')
equal to
SELECT.. WHERE NOT (title = '' AND headline = '')
If you want to exclude null (None), empty string (""), as well as a string containing white spaces (" "), you can use the __regex along with __isnull filter option
Name.objects.filter(
alias__isnull = False,
alias__regex = r"\S+"
)
alias__isnull=False excludes all the columns null columns
aliax__regex = r"\S+" makes sure that the column value contains at least one or more non whitespace characters.
From Django 1.8,
from django.db.models.functions import Length
Name.objects.annotate(alias_length=Length('alias')).filter(alias_length__gt=0)
You can simply do this:
Name.objects.exclude(alias="").exclude(alias=None)
It's really just that simple. filter is used to match and exclude is to match everything but what it specifies. This would evaluate into SQL as NOT alias='' AND alias IS NOT NULL.
Another approach using a generic isempty lookup, that can be used with any field.
It can also be used by django rest_framework or other apps that use django lookups:
from distutils.util import strtobool
from django.db.models import Field
from django.db.models.lookups import BuiltinLookup
#Field.register_lookup
class IsEmpty(BuiltinLookup):
lookup_name = 'isempty'
prepare_rhs = False
def as_sql(self, compiler, connection):
sql, params = compiler.compile(self.lhs)
condition = self.rhs if isinstance(self.rhs, bool) else bool(strtobool(self.rhs))
if condition:
return "%s IS NULL or %s = ''" % (sql, sql), params
else:
return "%s <> ''" % sql, params
You can then use it like this:
Name.objects.filter(alias__isempty=False)
this is another simple way to do it .
Name.objects.exclude(alias=None)