Django advanced LIKE filtering - django

So I'm trying to find a nice way to execute an advanced filter using the LIKE statement in Django.
Let's say I have the following records in a table called elements:
id = 1, name = 'group[1].car[8]'
id = 2, name = 'group[1].car[9]'
id = 3, name = 'group[1].truck[1]'
id = 4, name = 'group[1].car[10]'
id = 5, name = 'group[1].carVendor[1]'
I would like to select all elements that look like group[x].car[y].
To query this in SQL I would do:
SELECT * FROM elements WHERE name LIKE 'group[%].car[%]'
Now, by reading the Django documentation here, I see that the only pre-built LIKE statements are the following:
contains: name LIKE '%something%'
startswith: name LIKE 'something%'
endswith: name LIKE '%something'
So the one I need is missing:
plain like: name LIKE 'group[%].car[%]'
I'm also using Django Rest Framework to write up my API endpoints and also here we find the possibility to use:
contains: name__contains = something
startswith: name__startswith = something
endswith: name__endswith = something
So also here, the one I need is missing:
plain like: name__like 'group[%].car[%]'
Of course I know I can write a raw SQL query through Django using the raw() method, but I would like to use this option only if no better solution comes up, because:
I need to make sure my customization is safe
I need to extend the customization to DRF
Can anybody think of a way to help me out with this in a way to go with the flow with both Django and Django Rest Framework?

You can use a regular expression (regex) [wiki] for this, with the __iregex lookup [Django-doc]:
Elements.objects.filter(name__iregex=r'^group\[.*\]\.car\[.*\]$')
If only digits are allowed between the square brackets, we can make it more specific with:
# only digits between the square brackets
Elements.objects.filter(name__iregex=r'^group\[\d*\]\.car\[\d*\]$')
Since some of the specifications are a bit "complex", it is better to test your regex first, for example with regex101, where you can see which names will be matched and which will not.
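If you would rather keep writing LIKE-style patterns, one option is to translate them into a regex before handing them to a regex lookup. The sketch below is only an illustration: like_to_regex is a hypothetical helper (not part of Django or DRF), it ignores LIKE escape characters, and it is checked here with Python's re module against the sample names from the question. Regex dialects differ between databases, so the result may need adjusting for yours.

```python
import re


def like_to_regex(pattern: str) -> str:
    """Translate a SQL LIKE pattern into an anchored regular expression.

    Hypothetical helper: '%' becomes '.*', '_' becomes '.', and every other
    character is escaped. LIKE escape characters are not handled.
    """
    parts = []
    for char in pattern:
        if char == "%":
            parts.append(".*")
        elif char == "_":
            parts.append(".")
        else:
            parts.append(re.escape(char))
    return "^" + "".join(parts) + "$"


# Sample names taken from the question.
names = [
    "group[1].car[8]",
    "group[1].car[9]",
    "group[1].truck[1]",
    "group[1].car[10]",
    "group[1].carVendor[1]",
]

regex = like_to_regex("group[%].car[%]")
matching = [name for name in names if re.match(regex, name)]
print(regex)
print(matching)
```

The generated string is the same pattern as the first regex in the answer, so it could be passed to the regex lookup in the same way.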

Related

django split data and apply search istartswith = query

I have a project in which, when searching, I need to split the data (not the search query) into words and apply the search to each word.
for example:
my query is 'bot' (while typing 'bottle')
but if I use meta_keywords__icontains = query, the filter will also return results containing 'robot'.
Here meta_keywords are keywords that can be used for searching.
If the data in meta_keywords is 'water bottle', I won't be able to find it when I use meta_keywords__istartswith. Is there anything I can use in this case?
What I need is to search every word of the data using just istartswith.
I could simply create a model for 'meta_keywords' and populate it by splitting the current data and saving each word as a separate row. I know that might be the best way, but I need some other way to achieve it.
You can search the name field for any word that starts with the value of the variable query.
import re
from django.db.models import Q
instances = Model.objects.filter(Q(name__iregex=r'[[:<:]]' + re.escape(query)))
E.g. 'Hello world' can be found using the query 'hello' or 'world'. Unlike icontains, it does not match in the middle of a word.
Note: it works only in Python 3.
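The word-start idea can be tried out without a database. This is a minimal sketch using Python's re module, where \b plays the role of the word-boundary marker; matches_word_prefix is a hypothetical helper name, and the word-boundary syntax accepted by your database may differ from Python's.

```python
import re


def matches_word_prefix(text: str, query: str) -> bool:
    """Return True if any word in `text` starts with `query` (case-insensitive)."""
    # \b anchors the match at a word boundary; re.escape keeps the query literal.
    pattern = r"\b" + re.escape(query)
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


print(matches_word_prefix("water bottle", "bot"))
print(matches_word_prefix("robot", "bot"))
```

Here 'bot' is found in 'water bottle' because 'bottle' starts a word, while 'robot' is rejected because the match would begin in the middle of a word.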

Django Postgres - dynamic SearchQuery object creation

I have an app that lets the user search a database of +/- 100,000 documents for keywords / sentences.
I am using Django 1.11 and the Postgres FullTextSearch features described in the documentation.
However, I am running into the following problem and I was wondering if someone knows a solution:
I want to create a SearchQuery object for each word in the supplied queryset like so:
query typed in by the user in the input field: ['term1' , 'term2', 'term3']
query = SearchQuery('term1') | SearchQuery('term2') | SearchQuery('term3')
vector = SearchVector('text')
Document.objects.annotate(rank=SearchRank(vector, query)).order_by('-rank').annotate(similarity=TrigramSimilarity(vector, query)).filter(similarity__gt=0.3).order_by('-similarity')
The problem is that I used 3 terms for my query in the example, but I want that number to be dynamic. A user could also supply 1, or 10 terms, but I do not know how to add the relevant code to the query assignment.
I briefly thought about having the program write something like this to an empty document:
for query in terms:
    file.write(' | SearchQuery(%s)' % query)
But having a python program writing python code seems like a very convoluted solution. Does anyone know a better way to achieve this?
I've never used it, but to build a dynamic query you can just loop and combine the terms.
compound_statement = SearchQuery(list_of_words[0])
for term in list_of_words[1:]:
    compound_statement = compound_statement | SearchQuery(term)
But the documentation tells us that
By default, all the words the user provides are passed through the stemming algorithms, and then it looks for matches for all of the resulting terms.
Are you sure you need this?
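The loop above can also be written as a fold with functools.reduce and operator.or_. The sketch below runs without Django: Term is a hypothetical stand-in that only mimics the | operator used in the question, and combine_terms is an illustrative helper name. With Django installed, the same reduce call should work on SearchQuery objects, since they are combined with | in the question.

```python
from functools import reduce
import operator


class Term:
    """Hypothetical stand-in for SearchQuery; it only supports `|`."""

    def __init__(self, expr: str):
        self.expr = expr

    def __or__(self, other: "Term") -> "Term":
        return Term(f"({self.expr} | {other.expr})")


def combine_terms(words):
    """OR together one Term per word, for any number of words >= 1."""
    if not words:
        raise ValueError("at least one search term is required")
    return reduce(operator.or_, (Term(word) for word in words))


combined = combine_terms(["term1", "term2", "term3"])
print(combined.expr)
```

Because reduce with a single element returns that element unchanged, a one-word query needs no special case; only the empty list has to be handled.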

In querying a collection in a Mongo database with Mongoose, how can I find where a value is LIKE a query term?

I've searched hard for an answer to this but haven't found anything that works. I have a NodeJS app, with the Mongoose ORM. I'm trying to query my Mongo database where a result is LIKE the query.
I have tried using a new RegExp to find the results, but it hasn't worked for me. The only time I get a result is when the query is exactly the same as the collection property's value.
Here's what I'm using right now:
var query = "Some Query String.";
var q = new RegExp('^/.*'+ query +'.*/i$');
Quote.find({author: q}, function(err, doc){
    cb(doc);
});
If the value of an author property contains something LIKE the query (for instance: 'some. query String'), I need to return the results. Perhaps stripping case, and excluding special characters is all I can do? What is the best way to do this? My RegEx in this example is obviously not working. Thanks!
You likely want to create your RegExp as follows instead as you don't include the / chars when using new RegExp:
var q = new RegExp(query, 'i');
I don't know of a way to ignore periods in the author properties of your docs with a RegExp though. You may want to look at $text queries for more flexible searching like that.
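The same pitfall can be reproduced outside JavaScript. The Python sketch below is only an illustration (Python's re module is not what Mongoose runs): slashes and a trailing i placed inside the pattern string are treated as literal characters, whereas passing the flag separately gives a case-insensitive substring match. Escaping the query with re.escape is an extra precaution that is not part of the answer above, and the author string is made up for the example.

```python
import re

query = "Some Query String."
author = "He said: some query string. Twice."  # made-up sample value

# Mirrors the question: the slashes and the "i" become part of the pattern.
broken = re.compile("^/.*" + query + ".*/i$")

# Mirrors the answer: pattern and flag are passed separately.
fixed = re.compile(re.escape(query), re.IGNORECASE)

print(broken.search(author) is not None)
print(fixed.search(author) is not None)
```

The broken pattern only matches strings that literally start with "/" and end with "/i", which is why the original query almost never returned anything.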

Django startswith on fields

Let's say that I have an Address model with a postcode field. I can look up addresses with a postcode starting with "123" with this line:
Address.objects.filter(postcode__startswith="123")
Now, I need to do this search the "other way around". I have an Address model with a postcode_prefix field, and I need to retrieve all the addresses for which postcode_prefix is a prefix of a given code, like "12345". So if in my db I had 2 addresses with postcode_prefix = "123" and "234", only the first one would be returned.
Something like:
Address.objects.filter("12345".startswith(postcode_prefix))
The problem is that this doesn't work.
The only solution I can come up with is to perform a filter on the first char, like:
Address.objects.filter(postcode_prefix__startswith="12345"[0])
and then, when I get the results, make a list comprehension that filters them properly, like this:
results = [r for r in results if "12345".startswith(r.postcode_prefix)]
Is there a better way to do it in django?
Edit: This does not answer the original question but how to word a query the other way around.
I think what you are trying to do with your "something like" line is properly written as this:
Address.objects.filter(postcode__startswith=postcode_prefix)
In SQL terms, what you want to achieve reads like ('12345' is the postcode you are searching for):
SELECT *
FROM address
WHERE '12345' LIKE postcode_prefix||'%'
This is not really a standard query and I do not see any possibility to achieve this in Django using only get()/filter().
However, Django offers a way to provide additional SQL clauses with extra():
postcode = '12345'
Address.objects.extra(where=["%s LIKE postcode_prefix||'%%'"], params=[postcode])
Please see the Django documentation on extra() for further reference. Also note that the extra contains pure SQL, so you need to make sure that the clause is valid for your database.
Hope this works for you.
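The reversed LIKE clause can be tried in isolation before wiring it into extra(). This is a small sketch using Python's built-in sqlite3 module with an in-memory table; the table layout and the prefixes_of helper are assumptions made for the example, and other databases may need a different concatenation operator than ||. Note that a stored prefix containing % or _ would act as a wildcard.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE address (id INTEGER PRIMARY KEY, postcode_prefix TEXT)"
)
conn.executemany(
    "INSERT INTO address (postcode_prefix) VALUES (?)",
    [("123",), ("234",)],
)


def prefixes_of(postcode):
    """Return stored prefixes that the given postcode starts with."""
    rows = conn.execute(
        "SELECT postcode_prefix FROM address "
        "WHERE ? LIKE postcode_prefix || '%' "
        "ORDER BY id",
        (postcode,),  # the searched value is bound as a parameter
    )
    return [row[0] for row in rows]


print(prefixes_of("12345"))
```

With the two rows from the question, only the "123" prefix is returned for "12345".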
Bit of a mouthful, but you can do this by annotating your search value and then filtering against it. It all happens pretty quickly in-database.
from django.db.models import CharField, F, Value
Address.objects.exclude(
    postcode_prefix=''
).annotate(
    postcode=Value('12345', output_field=CharField())
).filter(
    postcode__startswith=F('postcode_prefix')
)
The exclude is only necessary if postcode_prefix can be empty. An empty prefix would result in an SQL pattern like '%', which would match every postcode.
I'm sure you could do this via a nice templated function these days too... But this is clean enough for me.
A possible alternative. (I have no idea how it compares in execution time to the accepted solution, which uses a column as the second param to LIKE.)
from functools import reduce
from django.db.models import Q
q = reduce(lambda a, b: a | b, [Q(postcode_prefix=postcode[:i + 1]) for i in range(len(postcode))])
Thus, you generate all prefixes of the postcode and OR them together.
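The prefix-generation step is easy to check on its own. In this sketch, all_prefixes is a hypothetical helper and stored_prefixes is made-up sample data standing in for the postcode_prefix column; the membership test plays the role of the OR-ed conditions.

```python
def all_prefixes(postcode):
    """Return every non-empty prefix of `postcode`, shortest first."""
    return [postcode[:i + 1] for i in range(len(postcode))]


# Made-up values standing in for the postcode_prefix column.
stored_prefixes = ["123", "234", "1", "12345", "123456"]

candidates = set(all_prefixes("12345"))
matching = [prefix for prefix in stored_prefixes if prefix in candidates]

print(all_prefixes("12345"))
print(matching)
```

Since every condition is an exact match on the same field, the list of prefixes could presumably also be passed to a single __in lookup instead of OR-ing Q objects.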
The raw SQL query that does what you need looks something like this:
select * from postal_code_table where '1234567' like postal_code||'%'
This query will select any postal_code from your table that is a substring of '1234567' and that also starts at the beginning, i.e. '123', '1234', etc.
Now, to implement this in Django, the preferred method is a custom lookup:
from django.db.models.fields import Field
from django.db.models import Lookup
@Field.register_lookup
class LowerStartswithContainedBy(Lookup):
    '''Postgres LIKE query statement'''
    lookup_name = 'istartswithcontainedby'
    def as_sql(self, compiler, connection):
        lhs, lhs_params = self.process_lhs(compiler, connection)
        rhs, rhs_params = self.process_rhs(compiler, connection)
        # rhs appears first in the SQL below, so its params must come first
        params = (*rhs_params, *lhs_params)
        return f"LOWER({rhs}) LIKE LOWER({lhs}) || '%%'", params
Now you can write a Django query such as the following:
PostCode.objects.filter(code__istartswithcontainedby='1234567')
Similarly, if you are just looking for a substring and do not require the startswith condition, simply modify the return line of the as_sql method to the following:
return f"LOWER({rhs}) LIKE '%%' || LOWER({lhs}) || '%%'", params
For more detailed explanation, see my git gist Django custom lookup
A. If it were not for the issue https://code.djangoproject.com/ticket/13363,
you could do this:
queryset.extra(select={'myconst': "'this superstring is myconst value'"}).filter(myconst__contains=F('myfield'))
Maybe they will fix the issue and it will work.
B. If it were not for issue 16731 (sorry for not providing the full URL, not enough rep; see the other ticket above), you could filter by fields added with '.annotate' by creating a custom aggregation function, like here:
http://coder.cl/2011/09/custom-aggregates-on-django/
C. Last, and successful: I have managed to do this by monkeypatching the following:
django.db.models.sql.Query.query_terms
django.db.models.fields.Field.get_prep_lookup
django.db.models.fields.Field.get_db_prep_lookup
django.db.models.sql.where.WhereNode.make_atom
I just defined a custom lookup '_starts', which has the reverse logic of '_startswith'.

Django select rows with duplicate field values for specific foreign key

Again I would like to search for duplicates in my models, but now in a slightly different case.
Here are my models:
class Concept(models.Model):
    main_name = models.ForeignKey(Literal)
    ...
class Literal(models.Model):
    name = models.CharField(...)
    concept = models.ForeignKey(Concept)
    ...
And now the task I'm trying to achieve:
Select all literals that are NOT main_names, that have the same name for the same concept.
For example if I have literals:
[{id:1, name:'test', concept:1}, {id:2, name:'test', concept:1}]
and concepts:
[{id:1, main_name:1}]
Then in result I should get literal with the ID=2.
It sounds to me as though you want to execute a SQL query something like this:
SELECT l1.* FROM myapp_literal AS l1,
myapp_literal AS l2
WHERE l1.id <> l2.id
AND l1.name = l2.name
AND l1.concept = l2.concept
AND l1.id NOT IN (SELECT main_name FROM myapp_concept)
GROUP BY l1.id
Well, in cases where the query is too complex to easily express in Django's query language, you can always ask Django to do a raw SQL query—and this may be one of those cases.
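Before turning the SQL above into a raw query, it can be checked against the example data from the question. This is a rough sketch using Python's built-in sqlite3 module; the table and column names are copied from the SQL above (tables generated by Django would normally name the foreign key columns concept_id and main_name_id), and the third literal is an extra row added here to show that non-duplicates are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE myapp_literal (id INTEGER PRIMARY KEY, name TEXT, concept INTEGER);
    CREATE TABLE myapp_concept (id INTEGER PRIMARY KEY, main_name INTEGER);
    -- Literals 1 and 2 come from the question; literal 3 is an added non-duplicate.
    INSERT INTO myapp_literal (id, name, concept) VALUES
        (1, 'test', 1), (2, 'test', 1), (3, 'other', 1);
    INSERT INTO myapp_concept (id, main_name) VALUES (1, 1);
""")

duplicate_ids = [
    row[0]
    for row in conn.execute("""
        SELECT l1.id FROM myapp_literal AS l1, myapp_literal AS l2
        WHERE l1.id <> l2.id
          AND l1.name = l2.name
          AND l1.concept = l2.concept
          AND l1.id NOT IN (SELECT main_name FROM myapp_concept)
        GROUP BY l1.id
    """)
]
print(duplicate_ids)
```

Literal 1 is dropped because it is a main name and literal 3 has no twin, so only the literal with ID 2 remains, as the question expects.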
If I understand your question, you want:
All Literal objects that are not referenced as main_name by a Concept.
From that set, select those where the name and the concept are the same.
If so, I think this should work:
For the first part:
q = Literal.objects.exclude(pk__in=Concept.objects.values_list('main_name', flat=True))
EDIT:
Based on excellent feedback from Jan, I think for #2 you would need to use raw SQL.