Django ORM: filtering on concatenated fields - django

In my app, I have a document number which consists of several fields of Document model like:
{{doc_code}}{{doc_num}}-{{doc_year}}
doc_num is an integer in the model, but the user sees it as a five-digit string padded with leading zeros, like 00024 or 00573.
doc_year is a date field in the model, but in the full document number it appears as the last two digits of the year.
So for users, the document number looks like, for example, TR123.00043-22.
I want to implement searching on the documents list page.
One approach is to autogenerate a full_number field from the doc_code, doc_num and doc_year fields in the save method of the Document model and filter on full_number.
Another is to use the Concat function before filtering the query.
First, annotate a concatenated full_code field
docs = Document.objects.annotate(full_code=Concat('doc_code', 'doc_num', Value('-'), 'doc_year', output_field=CharField()))
and then filter by the full_code field
docs = docs.filter(full_code__icontains=keyword)
But how do I pass doc_num as a five-digit string and doc_year as the last two digits of the year to the Concat function?
Or what would be a better solution for this task?

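Concat takes field names and expressions such as Value, but it does not format them for you, so the zero-padding and the two-digit year would have to be built with further database functions (for example Cast and LPad in recent Django versions) before concatenating. That gets verbose quickly, so you don't really have many convenient options there that I know of.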
As you note, you can set an extra field on save. That's probably the best approach if you are going to be using it in multiple places.
The save method would look something like
def save(self, *args, **kwargs):
    # Build the display number first so that one database write is enough.
    self.full_code = f"{self.doc_code}{self.doc_num:05d}-{self.doc_year:%y}"
    super().save(*args, **kwargs)
The f-string formatting of doc_num requires Python >= 3.6; on earlier versions '{:05d}'.format(self.doc_num) gives the same result.
The doc_year formatting assumes it is a date or datetime. If it is just a four-digit int, then something like str(self.doc_year)[-2:] should work instead.
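As a minimal sketch of the formatting step on its own, outside Django, the helper below builds the user-facing number from the three values. The name format_full_number is hypothetical, and the sketch assumes that doc_year is a datetime.date and that doc_code already carries the trailing dot seen in the TR123.00043-22 example.

```python
from datetime import date


def format_full_number(doc_code: str, doc_num: int, doc_year: date) -> str:
    """Return code + zero-padded number + '-' + two-digit year."""
    # :05d pads the integer with leading zeros to five digits;
    # :%y formats the date as the last two digits of its year.
    return f"{doc_code}{doc_num:05d}-{doc_year:%y}"


# Assumption: the dot belongs to doc_code, as in the question's example.
example = format_full_number("TR123.", 43, date(2022, 3, 1))
```

The same expression can be used inside save, with the model's own attributes in place of the arguments.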
Alternatively, if you are only going to use it rarely, you could loop through your recordset, adding an additional attribute
docs = Document.objects.all()  # or whatever filter is appropriate
for doc in docs:
    doc.full_code = f"{doc.doc_code}{doc.doc_num:05d}-{doc.doc_year:%y}"
    # or f"{doc.doc_code}{doc.doc_num:05d}-{str(doc.doc_year)[-2:]}" if doc_year is not a date
and then convert it to a list, so you don't make another DB call and lose your new attribute, and filter it via a list comprehension.
filtered_docs = [x for x in list(docs) if search_term in x.full_code]
Pass filtered_docs to your template and away you go.
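The in-memory filtering can be sketched without a database. Doc below is a hypothetical stand-in for Document instances, and the match is made case-insensitive to mirror icontains; that choice is an assumption rather than part of the answer above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Doc:
    """Hypothetical stand-in for a Document model instance."""
    doc_code: str
    doc_num: int
    doc_year: date

    @property
    def full_code(self) -> str:
        # Same formatting as the user-facing document number.
        return f"{self.doc_code}{self.doc_num:05d}-{self.doc_year:%y}"


def search(docs, keyword):
    """Keep the docs whose full_code contains keyword, ignoring case."""
    needle = keyword.lower()
    return [d for d in docs if needle in d.full_code.lower()]


docs = [
    Doc("TR123.", 43, date(2022, 5, 1)),
    Doc("TR123.", 573, date(2021, 5, 1)),
    Doc("AB7.", 24, date(2022, 1, 1)),
]
matches = search(docs, "tr123.000")
```

Every row is loaded into memory before filtering, so this only suits small tables or rarely used pages.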

Related

How can I filter a column in Django using .filter?

I have the following line:
test = MyModel.objects.filter(user=request.user)
The problem with this line is that it retrieves the whole row. What if I want to retrieve the data from a single column? For example, instead of the whole row, I'm trying to retrieve the email column.
I tried this, but it's not working: email = test.email
You can use .values_list('email', flat=True) [Django-doc], like:
test = MyModel.objects.filter(user=request.user).values_list('email', flat=True)
Then test is a QuerySet of strings. Usually, though, this is not good software design: one normally retrieves and stores model objects such as User rather than column values, because a model can carry extra logic that validates or cleans values and prevents certain values from being stored.
If a User has a single MyModel, you can just use .get(..) instead, like:
the_email = MyModel.objects.get(user=user).email
or with .values_list:
the_email = MyModel.objects.values_list('email', flat=True).get(user=user)

Database mutable field

I have to store data, part of which is predefined, but the user can choose to customize it.
What is the best way to store this data in the database?
1) 2 fields: an integer field for the predefined option and a string field for the custom user input
2) 1 string field containing JSON like {predefined: 2, custom: ''}
3) 1 string field containing either the custom string or the predefined option id (converted during request processing)
4) 1 string field containing the full text of the option even if it is predefined (some of these predefined options can be long text)
I tried 1), but doubling the number of fields for each "custom-ready" piece of data doesn't seem ideal...
Any ideas?
Considering you might need the following (it's not very clear from your question):
a form where there is an input field for the customizable part of the string
an easy way to refer to the complete string
a way to administer/manage/validate the non-customizable string
=> use two fields:
class TheModel(Model):
    # if you have a fixed set of choices, use a field with choices=...
    # otherwise use a ForeignKey and create a separate model for those
    non_customizable_prefix = CharField(choices=..., null=False, blank=False, ...)
    # unique? validators? max/min length? null/blank?
    customizable_part = CharField(...)

    @property
    def complete_string(self):
        return '{}{}'.format(self.non_customizable_prefix, self.customizable_part)
This model will provide you with two separate input fields in Django forms or the Django admin, offering easy ways to make the non_customizable_prefix read only or only modifiable with certain privileges.
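The two-field idea can be illustrated in plain Python. This is only a sketch: the Entry class, the PREFIXES mapping and its texts are hypothetical, and a real model would use Django fields as outlined above.

```python
from dataclasses import dataclass

# Hypothetical predefined options, keyed by the id that would be stored.
PREFIXES = {1: "Invoice for ", 2: "Reminder about "}


@dataclass
class Entry:
    prefix_id: int         # predefined, non-customizable part
    custom_part: str = ""  # free text typed by the user

    @property
    def complete_string(self) -> str:
        # Look up the predefined text and append the custom part.
        return f"{PREFIXES[self.prefix_id]}{self.custom_part}"


entry = Entry(prefix_id=2, custom_part="order 17")
```

Storing only the option id keeps long predefined texts in one place, while complete_string gives an easy way to refer to the combined value.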

Filtering on the concatenation of two model fields in django

With the following Django model:
class Item(models.Model):
    name = models.CharField(max_length=256)
    description = models.TextField()
I need to formulate a filter method that takes a list of n words (word_list) and returns the queryset of Items where each word in word_list can be found, either in the name or the description.
To do this with a single field is straightforward enough. Using the reduce technique described here (this could also be done with a for loop), this looks like:
q = reduce(operator.and_, (Q(description__contains=word) for word in word_list))
Item.objects.filter(q)
I want to do the same thing but take into account that each word can appear either in the name or the description. I basically want to query the concatenation of the two fields, for each word. Can this be done?
I have read that there is a concatenation operator in Postgresql, || but I am not sure if this can be utilized somehow in django to achieve this end.
As a last resort, I can create a third column that contains the combination of the two fields and maintain it via post_save signal handlers and/or save method overrides, but I'm wondering whether I can do this on the fly without maintaining this type of "search index" type of column.
The most straightforward way would be to use Q to do an OR:
lookups = [Q(name__contains=word) | Q(description__contains=word)
           for word in word_list]
Item.objects.filter(*lookups)  # the same as AND-ing them together
I can't speak to the performance of this solution as compared to your other two options (raw SQL concatenation or denormalization), but it's definitely simpler.
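The condition that the combined Q objects express is "every word occurs in at least one of the two fields". A minimal pure-Python sketch of that predicate follows; the function name and sample data are hypothetical, and the comparison here is case-sensitive.

```python
def matches_all_words(name, description, word_list):
    """True if every word occurs in name or in description."""
    return all(word in name or word in description for word in word_list)


items = [
    ("red bicycle", "a fast road bike"),
    ("blue bicycle", "slow but sturdy"),
    ("red kettle", "boils water"),
]
found = [
    name for name, description in items
    if matches_all_words(name, description, ["red", "fast"])
]
```

Unlike a search over a true concatenation of the two fields, this never matches a word that straddles the boundary between name and description.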

django - queryset order_by field that has characters and integers

I have a model that has a field (let's call it this_field) which is stored as a string. The values in this_field are in the form Char###, with values such as A4, A34 or A55 (note that they will always have the same first character).
Is there a way to select all the records and order them by this field? Currently, when I do an .order_by('this_field') I get the order like this:
A1
A13
A2
A25
etc...
What I want is:
A1
A2
A13
A25
etc...
What is the best way to get this queryset ordered properly?
Queryset ordering is handled by the database backend, not by Django. This limits the options for changing the way that ordering is done. You can either load all of the data and sort it with Python, or add additional options to your query to have the database use some kind of custom sorting by defining functions.
Using the queryset extra() method will allow you to do what you want by executing custom SQL for sorting, but at the expense of reduced portability.
In your example, it would probably suffice to split the input field into two sets of data, the initial character, and the remaining integer value. You could then apply a sort to both columns. Here's an example (untested):
qs = MyModel.objects.all()
# Add a couple of extra SELECT columns, pulling apart this_field into
# this_field_a (the character portion) and this_field_b (the integer portion).
qs = qs.extra(select={
    'this_field_a': "SUBSTR(this_field, 1, 1)",
    'this_field_b': "CAST(SUBSTR(this_field, 2) AS UNSIGNED)"})
The extra call adds two new fields into the SELECT call. The first clause pulls out the first character of the field, the second clause converts the remainder of the field to an integer.
It should now be possible to order_by on these fields. By specifying two fields to order_by, ordering applies to the character field first, then to the integer field.
For example:
qs = qs.order_by('this_field_a', 'this_field_b')
This example should work on both MySQL and SQLite. It should also be possible to create a single extra field which is used only for sorting, allowing you to specify just a single field in the order_by() call.
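To see the effect of the two sort expressions, here is a small sketch that runs equivalent raw SQL against an in-memory SQLite database through Python's sqlite3 module. The table name mymodel is hypothetical. The first expression takes exactly one character with SUBSTR(this_field, 1, 1), and the cast uses INTEGER, which SQLite accepts; the UNSIGNED spelling is the MySQL one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mymodel (this_field TEXT)")
conn.executemany(
    "INSERT INTO mymodel (this_field) VALUES (?)",
    [("A13",), ("A2",), ("A1",), ("A25",)],
)

# Sort by the leading character first, then by the rest as a number.
rows = conn.execute(
    """
    SELECT this_field
    FROM mymodel
    ORDER BY SUBSTR(this_field, 1, 1),
             CAST(SUBSTR(this_field, 2) AS INTEGER)
    """
).fetchall()
ordered = [row[0] for row in rows]
```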
If you use this sort order a lot, and on bigger tables, you should think about two separate fields that contain the separated values:
the alpha values should be lowercased only, in a text or char field, with db_index=True set
the numeric values should be in an integer field with db_index=True set
qs.order_by('alpha_sort_field', 'numeric_sort_field')
Otherwise you will probably see a noticeable, possibly severe, performance impact.
Another way of doing it is to sort the results in Python by the integer part of this_field. Note that sorted() returns a list, not a QuerySet:
qs = ModelClass.objects.all()
sorted_qs = sorted(qs, key=lambda obj: int(obj.this_field[1:]))
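A slightly more general sketch of the Python-side sort splits each value into its letter prefix and its number, so it also works if the prefix ever differs. The helper name natural_key is hypothetical, and the pattern assumes values made of letters followed by digits.

```python
import re


def natural_key(value):
    """Split e.g. 'A13' into ('A', 13) so the number compares numerically."""
    match = re.fullmatch(r"([A-Za-z]+)(\d+)", value)
    if match is None:
        raise ValueError(f"unexpected format: {value!r}")
    prefix, number = match.groups()
    return prefix, int(number)


values = ["A1", "A13", "A2", "A25"]
ordered = sorted(values, key=natural_key)
```

With model instances the key would be applied to the attribute, for example key=lambda obj: natural_key(obj.this_field).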

Django DB, finding Categories whose Items are all in a subset

I have two models:
class Category(models.Model):
    pass
class Item(models.Model):
    cat = models.ForeignKey(Category)
I am trying to return all Categories for which all of that category's items belong to a given subset of item ids. For example, all categories for which all of the items associated with that category have ids in the set [1,3,5].
How could this be done using Django's query syntax (as of 1.1 beta)? Ideally, all the work should be done in the database.
Category.objects.filter(item__id__in=[1, 3, 5])
Django creates the reverse relationship on the model without the foreign key. You can filter on it by using its related name (usually just the lowercased model name, but it can be overridden manually), two underscores, and the field name you want to query on. Note that this filter matches categories that have at least one item in the list, not only those whose items are all in it.
Let's say you require all items to be in the following set:
allowable_items = set([1,3,4])
One brute-force solution would be to check the item_set of every category, like so:
categories_with_allowable_items = [
    category for category in
    Category.objects.all() if
    set([item.id for item in category.item_set.all()]) <= allowable_items
]
But we don't really have to check all categories, as categories_with_allowable_items is always going to be a subset of the categories related to the items with ids in allowable_items, so that's all we have to check (and this should be faster):
categories_with_allowable_items = set([
    item.cat for item in
    Item.objects.select_related('cat').filter(pk__in=allowable_items) if
    set([siblingitem.id for siblingitem in item.cat.item_set.all()]) <= allowable_items
])
If performance isn't really an issue, then the latter of these two (if not the former) should be fine. If these are very large tables, you might have to come up with a more sophisticated solution. Also, if you're using a particularly old version of Python, remember that you'll have to import the sets module.
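The subset test that both snippets rely on can be shown with plain data. This is a sketch that assumes the items arrive as (item_id, category_id) pairs; the function name categories_within is hypothetical.

```python
from collections import defaultdict


def categories_within(item_rows, allowable_items):
    """Return ids of categories whose items all lie in allowable_items."""
    items_by_category = defaultdict(set)
    for item_id, category_id in item_rows:
        items_by_category[category_id].add(item_id)
    # "a <= b" on sets is the subset test used in the comprehensions above.
    return {
        category_id
        for category_id, item_ids in items_by_category.items()
        if item_ids <= allowable_items
    }


rows = [(1, 10), (3, 10), (4, 20), (7, 20), (2, 30)]
result = categories_within(rows, {1, 3, 4})
```

Categories without any items never appear in rows, so they are left out here, as they are in the second snippet above.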
I've played around with this a bit. If QuerySet.extra() accepted a "having" parameter I think it would be possible to do it in the ORM with a bit of raw SQL in the HAVING clause. But it doesn't, so I think you'd have to write the whole query in raw SQL if you want the database doing the work.
EDIT:
This is the query that gets you part way there:
from django.db.models import Count
Category.objects.annotate(num_items=Count('item')).filter(num_items=...)
The problem is that for the query to work, "..." needs to be a correlated subquery that looks up, for each category, the number of its items in allowed_item_ids. If .extra had a "having" argument, you'd do it like this:
Category.objects.annotate(num_items=Count('item')).extra(having="num_items=(SELECT COUNT(*) FROM app_item WHERE app_item.id IN %s AND app_item.cat_id = app_category.id)", having_params=[allowed_item_ids])
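For reference, the "whole query in raw SQL" route can be sketched with Python's sqlite3 module and an in-memory database. The table names app_category and app_item follow the naming used above but are assumptions about the real schema, and the HAVING clause compares each category's item count with the count of its items that are in the allowed list, which is one possible formulation rather than the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE app_category (id INTEGER PRIMARY KEY);
    CREATE TABLE app_item (
        id INTEGER PRIMARY KEY,
        cat_id INTEGER REFERENCES app_category(id)
    );
    INSERT INTO app_category (id) VALUES (1), (2), (3);
    INSERT INTO app_item (id, cat_id)
        VALUES (1, 1), (3, 1), (5, 2), (6, 2), (2, 3);
    """
)

allowed_item_ids = [1, 3, 5]
# One "?" placeholder per allowed id, so the values are bound safely.
placeholders = ", ".join("?" for _ in allowed_item_ids)
query = f"""
    SELECT c.id
    FROM app_category AS c
    JOIN app_item AS i ON i.cat_id = c.id
    GROUP BY c.id
    HAVING COUNT(*) = SUM(CASE WHEN i.id IN ({placeholders}) THEN 1 ELSE 0 END)
    ORDER BY c.id
"""
category_ids = [row[0] for row in conn.execute(query, allowed_item_ids)]
```

Because of the inner join, categories with no items at all are not returned by this query.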