Django - filter queryset disregarding spaces

I have a model that contains phone numbers external_number, stored as a char field:
models.py
class PhoneRecord(models.Model):
    def __unicode__(self):
        return "Call to %s (%s)" % (self.external_number, self.call_date.strftime("%c"))

    INBOUND = "I"
    OUTBOUND = "O"
    DIRECTION_CHOICES = (
        (INBOUND, "Inbound"),
        (OUTBOUND, "Outbound"),
    )
    external_number = models.CharField(max_length=20)
    call_date = models.DateTimeField()
    external_id = models.CharField(max_length=20)
    call_duration = models.TimeField()
    call_direction = models.CharField(max_length=1, choices=DIRECTION_CHOICES, default=INBOUND)
    call = models.FileField(upload_to='calls/%Y/%m/%d')
The form is cleaning and storing the data using the UKPhoneNumberField from https://github.com/g1smd/django-localflavor-UK-forms/blob/master/django/localflavor/uk/forms.py
This means that the number is stored in the database in the format 01234 567890 i.e. with a space in.
I have created a filter using django-filters that works well when searching for a partial phone number, except that it doesn't match when the search term omits the space, i.e.
search for 01234 returns the record for the example above
search for 567890 returns the record for the example above
search for 01234 567890 returns the record for the example above
search for 01234567890 does not return the record for the example above
Now, I could subject the form on the filter to the same restrictions (i.e. use UKPhoneNumberField as on the input screen), but that would take away the ability to search for partial phone numbers.
I have also explored the possibility of using something like django-phonenumber-field that will control both the model and the form, but the validation provided by UKPhoneNumberField allows me to validate based on the type of number entered (e.g. mobile or geographical).
What I would ideally like is either
My filter to ignore spaces that are either input by the user in their search query, or any spaces that are in the stored value. Is this even possible?
Apply the validation provided by UKPhoneNumberField to another field type without having to analyse and re-write all the regular expressions provided.
Some other UK Phone number validation I have not found yet!

You could set up the filter to do something like this:
phone_number = "01234 567890"
parts = phone_number.split(" ")
PhoneRecord.objects.filter(
    external_number__startswith="{} ".format(parts[0]),
    external_number__endswith=" {}".format(parts[1]),
)
That way the filter is looking for the first half of the number with the space and then the second half of the number with the space as well. The only records that would be returned would be ones that had the value of "01234 567890"

I ended up adding a custom filter with a field_class = UKPhoneFilter. This means I can't search on partial numbers, but that was removed as a requirement from the project.

You should think about normalizing the data before you save it into your DB.
You could just use django-phonenumber-field, which does just that.
Regarding your problem, you can always use Django's __regex field lookup to query with a regular expression.
e.g. MyModel.objects.filter(myfield__regex=r'[a-zA-Z]+')
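Applied to the original question, one way to use the regex lookup is to build a pattern that allows an optional space between every pair of characters in the search term, so that 01234567890 and 01234 567890 both match the stored value. This is only a sketch: the helper name spaces_optional_pattern is hypothetical, the Django call appears only as a comment, and it assumes the database's regex dialect accepts the " ?" quantifier.

```python
import re


def spaces_optional_pattern(term):
    """Return a regex matching `term` with an optional space between characters.

    Spaces typed by the user are dropped first, so "01234 567890" and
    "01234567890" produce the same pattern.
    """
    chars = [re.escape(c) for c in term if not c.isspace()]
    return " ?".join(chars)


pattern = spaces_optional_pattern("01234567890")

# Hypothetical usage with the model from the question:
# PhoneRecord.objects.filter(external_number__regex=pattern)

# Checked here with Python's own regex engine against the stored format.
stored = "01234 567890"
full_match = re.search(pattern, stored) is not None
partial_match = re.search(spaces_optional_pattern("4567"), stored) is not None
```

An empty search term produces an empty pattern, which matches every row, so the caller may prefer to skip the filter in that case.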

Related

Django ORM: filtering on concatenated fields

In my app, I have a document number which consists of several fields of Document model like:
{{doc_code}}{{doc_num}}-{{doc_year}}
doc_num is an integer in the model, but for the user it is a five-digit string, left-padded with zeros, like 00024 or 00573.
doc_year is a date field in the model, but in full document number, it is the two last digits of the year.
So for users, the document number is for example - TR123.00043-22.
I want to implement searching on the documents list page.
One approach is to autogenerate the full_number field from doc_code, doc_num and doc_year fields in the save method of Document model and filter on this full_number.
Another is to use the Concat function before filtering the query.
First, concatenate the fields into a full_code annotation
docs = Document.objects.annotate(full_code=Concat('doc_code', 'doc_num', Value('-'), 'doc_year', output_field=CharField()))
and then filter on the full_code field
docs = docs.filter(full_code__icontains=keyword)
But how to pass doc_num as five digits string and doc_year as two last digits of year to Concat function?
Or what could be a better solution for this task?

Concat will only take field names and string values, so you don't really have many options there that I know of.
As you note, you can set an extra field on save. That's probably the best approach if you are going to be using it in multiple places.
The save method would look something like
def save(self, *args, **kwargs):
    # Build the searchable number first so a single save is enough.
    self.full_code = f"{self.doc_code}{self.doc_num:05d}-{self.doc_year:%y}"
    super().save(*args, **kwargs)
The f-string formatting of doc_num requires Python >= 3.6; earlier versions can use str.format or %-formatting instead.
The %y format assumes doc_year is a date or datetime. If it is just a four-digit int, then something like str(self.doc_year)[-2:] should work instead.
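The formatting itself can be tried outside the model. The sketch below is a plain function, not part of any Django API; the name full_document_number is hypothetical, and it assumes doc_code already carries the trailing dot seen in the question's example (TR123.) and that doc_year is a date.

```python
from datetime import date


def full_document_number(doc_code, doc_num, doc_year):
    """Build the user-facing number, e.g. TR123.00043-22.

    doc_num is zero-padded to five digits; doc_year contributes only the
    last two digits of its year.
    """
    return f"{doc_code}{doc_num:05d}-{doc_year:%y}"


# Assumed sample values, taken from the example in the question.
example = full_document_number("TR123.", 43, date(2022, 3, 1))
print(example)
```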
Alternately, if you are only ever going to use it rarely you could loop through your recordset adding an additional field
docs = Document.objects.all()  # or whatever filter is appropriate
for doc in docs:
    doc.full_code = f"{doc.doc_code}{doc.doc_num:05d}-{doc.doc_year:%y}"
    # or f"{doc.doc_code}{doc.doc_num:05d}-{str(doc.doc_year)[-2:]}" if doc_year is a four-digit int
and then convert it to a list so you don't make another DB call and lose your new field, and filter it via list comprehension.
filtered_docs = [x for x in list(docs) if search_term in x.full_code]
pass filtered_docs to your template and away you go.

Filter multiple Django model fields with variable number of arguments

I'm implementing search functionality with an option of looking for a record by matching multiple tables and multiple fields in these tables.
Say I want to find a Customer by his/her first or last name, or by ID of placed Order which is stored in different model than Customer.
The easy scenario which I already implemented is that a user only types single word into search field, I then use Django Q to query Order model using direct field reference or related_query_name reference like:
result = Order.objects.filter(
    Q(customer__first_name__icontains=user_input)
    | Q(customer__last_name__icontains=user_input)
    | Q(order_id__icontains=user_input)
).distinct()
Piece of cake, no problems at all.
But what if the user wants to narrow the search and types multiple words into the search field?
Example: the user has typed Bruce and got a whole lot of records back as a result.
Now he/she wants to be more specific and adds the customer's last name, so the search becomes Bruce Wayne. After splitting this into separate parts I have Bruce and Wayne. Obviously I don't want to search the Order model here, because order_id is a single word and is sufficient to find the customer on its own, so in this case I drop it from the query altogether.
Now I'm trying to match the customer by both first AND last name. I also want to handle the words arriving in any order, so that Bruce Wayne and Wayne Bruce both work: I still have the customer's full name, but the positions of the first and last name aren't fixed.
And this is the question I'm looking for an answer to: how do I build a query that searches multiple fields of a model without knowing which search word belongs to which field?
I'm guessing the solution is trivial and there's surely an elegant way to create such a dynamic query, but I can't think of one.

You can dynamically OR a variable number of Q objects together to achieve your desired search. The approach below makes it trivial to add or remove fields you want to include in the search.
from functools import reduce
from operator import or_

from django.db.models import Q

fields = (
    'customer__first_name__icontains',
    'customer__last_name__icontains',
    'order_id__icontains',
)
parts = []
terms = ["Bruce", "Wayne"]  # produce this from your search input field
for term in terms:
    for field in fields:
        # One Q object per (field, term) pair
        parts.append(Q(**{field: term}))
query = reduce(or_, parts)
result = Order.objects.filter(query).distinct()
The use of reduce combines the Q objects by ORing them together. Credit to that part of the answer goes to this answer.

The solution I came up with is rather complex, but it works exactly the way I wanted to handle this problem:
from functools import reduce
from operator import and_, or_

from django.db.models import Q

search_keys = user_input.split()
if len(search_keys) > 1:
    first_name_set = set()
    last_name_set = set()
    for key in search_keys:
        first_name_set.add(Q(customer__first_name__icontains=key))
        last_name_set.add(Q(customer__last_name__icontains=key))
    # (first name matches any key) AND (last name matches any key)
    query = reduce(and_, [reduce(or_, first_name_set), reduce(or_, last_name_set)])
else:
    search_fields = [
        Q(customer__first_name__icontains=user_input),
        Q(customer__last_name__icontains=user_input),
        Q(order_id__icontains=user_input),
    ]
    query = reduce(or_, search_fields)
result = Order.objects.filter(query).distinct()
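The matching rule can be checked without a database. The sketch below uses plain dictionaries and a slightly different formulation, in which every search word must appear in at least one name field; the helper matches and the sample customers are hypothetical, and this is an in-memory illustration of the logic rather than ORM code. The equivalent Q expression is given only as a comment.

```python
def matches(record, terms, fields=("first_name", "last_name")):
    """True when every term appears (case-insensitively) in at least one field."""
    return all(
        any(term.lower() in record[field].lower() for field in fields)
        for term in terms
    )


# Hypothetical sample data standing in for Customer rows.
customers = [
    {"first_name": "Bruce", "last_name": "Wayne"},
    {"first_name": "Bruce", "last_name": "Banner"},
    {"first_name": "Wayne", "last_name": "Rooney"},
]

forward = [c for c in customers if matches(c, "Bruce Wayne".split())]
reverse = [c for c in customers if matches(c, "Wayne Bruce".split())]

# Assumed ORM translation of the same rule:
# reduce(and_, [reduce(or_, [Q(**{f: t}) for f in fields]) for t in terms])
```

Both word orders select only the Bruce Wayne record, while a single word such as Bruce would still match both customers named Bruce.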

Django search objects by multiple keys

I am trying to build a search form where the user can fill in 1, 2 or all (3 in this case) search filters.
Let's say the search filters are:
last name, phone and address. I am trying to filter the queryset with:
if filterForm.is_valid():
    last_name = filterForm.cleaned_data.get('last_name')
    phone = filterForm.cleaned_data.get('phone')
    address = filterForm.cleaned_data.get('address')
    if last_name is None and phone is None and address is None:
        pass
        # we don't search the db
    else:
        clients = Client.objects.filter(Q(last_name__contains=last_name) | Q(phone=phone) | Q(address__contains=address))
Each search key may be blank.
Unfortunately, it returns more results than expected. When I type "Example" into the last name field, it returns all rows with this last name plus many other rows.
Any idea how to fix this search issue?

I believe your search returns more results than expected when any of the search keys is blank, since a blank key used with __contains will match any row that has a value.
Filtering only on keys that contain a value should work better.
Here is one example of how it can be done:
import operator
from functools import reduce

from django.db.models import Q

if filterForm.is_valid():
    last_name = filterForm.cleaned_data.get('last_name')
    phone = filterForm.cleaned_data.get('phone')
    address = filterForm.cleaned_data.get('address')

    # Only add a predicate for keys the user actually filled in.
    predicates = []
    if last_name:
        predicates.append(Q(last_name__contains=last_name))
    if phone:
        predicates.append(Q(phone=phone))
    if address:
        predicates.append(Q(address__contains=address))

    if len(predicates) == 0:
        # Nothing to search for
        pass
    else:
        clients = Client.objects.filter(reduce(operator.or_, predicates))
The code above dynamically builds only the filters that should be added to the query. Using operator.or_ joins the statements with OR (at least one statement needs to be satisfied). If you want all statements to be satisfied, use operator.and_ instead.
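When every supplied key should match (the AND case), plain keyword arguments are enough, because filter() combines keyword arguments with AND. The sketch below only builds the keyword dictionary; build_filter_kwargs and LOOKUPS are hypothetical names, and the Django call is shown as a comment.

```python
def build_filter_kwargs(cleaned_data, lookups):
    """Map non-blank form values to ORM lookup names.

    `lookups` maps a form field name to the lookup it should use,
    e.g. {"last_name": "last_name__contains"}.
    """
    return {
        lookup: cleaned_data[name]
        for name, lookup in lookups.items()
        if cleaned_data.get(name)  # skips missing keys, None and ""
    }


LOOKUPS = {
    "last_name": "last_name__contains",
    "phone": "phone",
    "address": "address__contains",
}

kwargs = build_filter_kwargs(
    {"last_name": "Example", "phone": "", "address": None}, LOOKUPS
)

# Hypothetical usage: every remaining key must match.
# clients = Client.objects.filter(**kwargs)
```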

django filter depending on character count of the related model's fields

I have two models such that
class Employer(models.Model):
    code = models.CharField(null=False,blank=False,default="")

class JobTitle(models.Model):
    employer = models.ForeignKey(Employer,unique=False,null=False,default=0)
    name = models.CharField(max_length=1000,null=False,default="")
and I would like to get all employers whose jobtitle name is less than X characters. How can I achieve this in Django?
Thanks

The correct code for this is
Employer.objects.filter(jobtitle__name__regex="^.{0,20}$")
This will select all the employers who have a job title name up to and including 20 characters long. Just replace the 20 with whatever number you need.
Note that if an Employer has multiple JobTitles whose names are at most 20 characters long, it will return that Employer in the list multiple times. If you don't want this to happen, you should add distinct() to the query as follows:
Employer.objects.filter(jobtitle__name__regex="^.{0,20}$").distinct()
You'll now only get the Employer back once, even if they have multiple short JobTitles.
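The pattern can be sanity-checked with Python's own regex engine before it is handed to the database. This is a sketch under the assumption that the database's regex dialect treats "." and "{0,n}" the same way Python does; max_length_pattern is a hypothetical helper name.

```python
import re


def max_length_pattern(max_chars):
    """Return a regex for single-line strings of at most `max_chars` characters."""
    return "^.{0,%d}$" % max_chars


pattern = max_length_pattern(20)

# Hypothetical usage:
# Employer.objects.filter(jobtitle__name__regex=pattern).distinct()

short_ok = re.match(pattern, "Software Engineer") is not None  # 17 characters
too_long = re.match(pattern, "x" * 21) is not None
```

For "strictly less than X characters", as the question asks, pass X - 1 to the helper.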

try
Employer.objects.filter(jobtitle__name__regex="^.{0,X}$")
When you use the regex lookup, records are filtered in the database by the given regular expression. In this case, all records whose name contains 0 to X characters are returned (replace X with the actual number).

Word count query in Django

Given a model with both Boolean and TextField fields, I want to do a query that finds records that match some criteria AND have more than "n" words in the TextField. Is this possible? e.g.:
class Item(models.Model):
    ...
    notes = models.TextField(blank=True,)
    has_media = models.BooleanField(default=False)
    completed = models.BooleanField(default=False)
    ...
This is easy:
items = Item.objects.filter(completed=True,has_media=True)
but how can I filter for a subset of those records where the "notes" field has more than, say, 25 words?

Try this:
Item.objects.extra(where=["LENGTH(notes) - LENGTH(REPLACE(notes, ' ', ''))+1 > %s"], params=[25])
This code uses Django's extra queryset method to add a custom WHERE clause. The calculation in the WHERE clause counts the occurrences of the space character, assuming that words are separated by exactly one space. Adding one to the result accounts for the first word.
Of course, this calculation is only an approximation to the real word count, so if it has to be precise, I'd do the word count in Python.
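To see where the approximation drifts, the same arithmetic can be reproduced in Python and compared with a whitespace-based count. This is a sketch; the function names and sample strings are hypothetical.

```python
def approx_word_count(text):
    """Mirror the SQL expression: number of space characters plus one."""
    return len(text) - len(text.replace(" ", "")) + 1


def exact_word_count(text):
    """Count whitespace-separated words."""
    return len(text.split())


samples = ["one two three", "one  two", ""]
# Pairs of (approximate, exact) counts for each sample.
comparison = [(approx_word_count(s), exact_word_count(s)) for s in samples]
```

The two counts agree for single-spaced text, but a doubled space or an empty notes field makes the SQL-style count one too high.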

I don't know what SQL would need to be run for the DB to do the work, which is really what we want, but you can work around it.
Add an extra field named wordcount or similar, then extend the save method so that it counts all the words in notes before saving the model.
Then it is trivial to filter on, and the denormalized value stays in sync as long as every write goes through the save method.
There might be a better way, but if all else fails, this is what I would do.
