Django ORM query with IN + LIKE

Django ORM query with IN + LIKE - django

Lets suppose the following model:
class Person(models.Model):
name = models.CharField(null=False)
...
I know that I can filter Persons with a name that contains the letter a using Person.objects.filter(name__contains='a'). And I know that I can filter Persons with name is in a list using Person.objects.filter(name__in=['John Doe', 'Mary Jane']).
Its possible ( or, whats the performative way ) to do the two things using just one filter ?
I know that I can do 2 queries (maybe more) and fetch the data. But in my current case, I have a method on my view called get_filters that returns a Q object that will be used in get_queryset method. So I need to implement this inside a Q object and fetching only 1 query. Its possible ?

Exactly as you figured, you can build up a Q() filter object.
Q(name__contains='a') & Q(name__in=['John Doe', 'Mary Jane'])
However, that will only ever match an object Mary Jane since John Doe doesn't contain an a. (The equivalent SQL is name LIKE '%a%' AND name IN ('John Doe', 'Mary Jane').
If you mean "find any object containing any of these substrings", that's possible too:
q = Q()
for substring in ['John', 'Mary', 'Jane']:
q |= Q(name__contains=substring)
This will be the equivalent of name LIKE '%John%' OR name LIKE '%Mary%' OR name LIKE '%Jane%'.

Related

Django ORM. Filter many to many with AND clause

With the following models:
class Item(models.Model):
name = models.CharField(max_length=255)
attributes = models.ManyToManyField(ItemAttribute)
class ItemAttribute(models.Model):
attribute = models.CharField(max_length=255)
string_value = models.CharField(max_length=255)
int_value = models.IntegerField()
I also have an Item which has 2 attributes, 'color': 'red', and 'size': 3.
If I do any of these queries:
Item.objects.filter(attributes__string_value='red')
Item.objects.filter(attributes__int_value=3)
I will get Item returned, works as I expected.
However, if I try to do a multiple query, like:
Item.objects.filter(attributes__string_value='red', attributes__int_value=3)
All I want to do is an AND. This won't work either:
Item.objects.filter(Q(attributes__string_value='red') & Q(attributes__int_value=3))
The output is:
<QuerySet []>
Why? How can I build such a query that my Item is returned, because it has the attribute red and the attribute 3?

If it's of any use, you can chain filter expressions in Django:
query = Item.objects.filter(attributes__string_value='red').filter(attributes__int_value=3')
From the DOCS:
This takes the initial QuerySet of all entries in the database, adds a filter, then an exclusion, then another filter. The final result is a QuerySet containing all entries with a headline that starts with “What”, that were published between January 30, 2005, and the current day.
To do it with .filter() but with dynamic arguments:
args = {
'{0}__{1}'.format('attributes', 'string_value'): 'red',
'{0}__{1}'.format('attributes', 'int_value'): 3
}
Product.objects.filter(**args)
You can also (if you need a mix of AND and OR) use Django's Q objects.
Keyword argument queries – in filter(), etc. – are “AND”ed together. If you need to execute more complex queries (for example, queries with OR statements), you can use Q objects.
A Q object (django.db.models.Q) is an object used to encapsulate a
collection of keyword arguments. These keyword arguments are specified
as in “Field lookups” above.
You would have something like this instead of having all the Q objects within that filter:
** import Q from django
from *models import Item
#assuming your arguments are kwargs
final_q_expression = Q(kwargs[1])
for arg in kwargs[2:..]
final_q_expression = final_q_expression & Q(arg);
result = Item.objects.filter(final_q_expression)
This is code I haven't run, it's out of the top of my head. Treat it as pseudo-code if you will.
Although, this doesn't answer why the ways you've tried don't quite work. Maybe it has to do with the lookups that span relationships, and the tables that are getting joined to get those values. I would suggest printing yourQuerySet.query to visualize the raw SQL that is being formed and that might help guide you as to why .filter( Q() & Q()) is not working.

Filter multiple Django model fields with variable number of arguments

I'm implementing search functionality with an option of looking for a record by matching multiple tables and multiple fields in these tables.
Say I want to find a Customer by his/her first or last name, or by ID of placed Order which is stored in different model than Customer.
The easy scenario which I already implemented is that a user only types single word into search field, I then use Django Q to query Order model using direct field reference or related_query_name reference like:
result = Order.objects.filter(
Q(customer__first_name__icontains=user_input)
|Q(customer__last_name__icontains=user_input)
|Q(order_id__icontains=user_input)
).distinct()
Piece of a cake, no problems at all.
But what if user wants to narrow the search and types multiple words into search field.
Example: user has typed Bruce and got a whole lot of records back as a result of search.
Now he/she wants to be more specific and adds customer's last name to search.So the search becomes Bruce Wayne, after splitting this into separate parts I'm having Bruce and Wayne. Obviously I don't want to search Orders model because order_id is a single-word instance and it's sufficient to find customer at once so for this case I'm dropping it out of query at all.
Now I'm trying to match customer by both first AND last name, I also want to handle the scenario where the order of provided data is random, to properly handle Bruce Wayne and Wayne Bruce, meaning I still have customers full name but the position of first and last name aren't fixed.
And this is the question I'm looking answer for: how to build query that will search multiple fields of model not knowing which of search words belongs to which table.
I'm guessing the solution is trivial and there's for sure an elegant way to create such a dynamic query, but I can't think of a way how.

You can dynamically OR a variable number of Q objects together to achieve your desired search. The approach below makes it trivial to add or remove fields you want to include in the search.
from functools import reduce
from operator import or_
fields = (
'customer__first_name__icontains',
'customer__last_name__icontains',
'order_id__icontains'
)
parts = []
terms = ["Bruce", "Wayne"] # produce this from your search input field
for term in terms:
for field in fields:
parts.append(Q(**{field: term}))
query = reduce(or_, parts)
result = Order.objects.filter(query).distinct()
The use of reduce combines the Q objects by ORing them together. Credit to that part of the answer goes to this answer.

The solution I came up with is rather complex, but it works exactly the way I wanted to handle this problem:
search_keys = user_input.split()
if len(search_keys) > 1:
first_name_set = set()
last_name_set = set()
for key in search_keys:
first_name_set.add(Q(customer__first_name__icontains=key))
last_name_set.add(Q(customer__last_name__icontains=key))
query = reduce(and_, [reduce(or_, first_name_set), reduce(or_, last_name_set)])
else:
search_fields = [
Q(customer__first_name__icontains=user_input),
Q(customer__last_name__icontains=user_input),
Q(order_id__icontains=user_input),
]
query = reduce(or_, search_fields)
result = Order.objects.filter(query).distinct()

Django get count of each age

I have this model:
class User_Data(AbstractUser):
date_of_birth = models.DateField(null=True,blank=True)
city = models.CharField(max_length=255,default='',null=True,blank=True)
address = models.TextField(default='',null=True,blank=True)
gender = models.TextField(default='',null=True,blank=True)
And I need to run a django query to get the count of each age. Something like this:
Age || Count
10 || 100
11 || 50
and so on.....

Here is what I did with lambda:
usersAge = map(lambda x: calculate_age(x[0]), User_Data.objects.values_list('date_of_birth'))
users_age_data_source = [[x, usersAge.count(x)] for x in set(usersAge)]
users_age_data_source = sorted(users_age_data_source, key=itemgetter(0))

There's a few ways of doing this. I've had to do something very similar recently. This example works in Postgres.
Note: I've written the following code the way I have so that syntactically it works, and so that I can write between each step. But you can chain these together if you desire.
First we need to annotate the queryset to obtain the 'age' parameter. Since it's not stored as an integer, and can change daily, we can calculate it from the date of birth field by using the database's 'current_date' function:
ud = User_Data.objects.annotate(
age=RawSQL("""(DATE_PART('year', current_date) - DATE_PART('year', "app_userdata"."date_of_birth"))::integer""", []),
)
Note: you'll need to change the "app_userdata" part to match up with the table of your model. You can pick this out of the model's _meta, but this just depends if you want to make this portable or not. If you do, use a string .format() to replace it with what the model's _meta provides. If you don't care about that, just put the table name in there.
Now we pick the 'age' value out so that we get a ValuesQuerySet with just this field
ud = ud.values('age')
And then annotate THAT queryset with a count of age
ud = ud.annotate(
count=Count('age'),
)
At this point we have a ValuesQuerySet that has both 'age' and 'count' as fields. Order it so it comes out in a sensible way..
ud = ud.order_by('age')
And there you have it.
You must build up the queryset in this order otherwise you'll get some interesting results. i.e; you can't group all the annotates together, because the second one for count depends on the first, and as a kwargs dict has no notion of what order the kwargs were defined in, when the queryset does field/dependency checking, it will fail.
Hope this helps.
If you aren't using Postgres, the only thing you'll need to change is the RawSQL annotation to match whatever database engine it is that you're using. However that engine can get the year of a date, either from a field or from its built in "current date" function..providing you can get that out as an integer, it will work exactly the same way.

keyword search over several fields in django

I am using the standard User model of django.
I would like to search users by first name and last name. So I have a free text search and using icontains I am searching for the that text.
However, I would like that a user can search for "John A" and retrieve all users which their First name + Last name contains "John A" however, because I am using icontains over first name and last name separately , it doesn't work. "John" alone will work. but as long as I have involved both fields, it doesn't (which make sense)
Any simple solution for this ?

Use Q objects to create your queries, and use | for OR and & for AND.
>>> from django.db.models import Q
>>> qs = Q(first_name__icontains='John A') | Q(last_name__icontains='John A')
>>> User.objects.filter(qs)
read more about it here
p.s. You can also use | and & to join regular querysets
User.objects.filter(...) | User.objects.filter(...)
but I think using Q objects is just neater when going over multiple fields
edit
as par your comment, all you want to do is this:
>>> query = 'John A'
>>> fn, ln = query.split()
>>> User.objects.filter(first_name=fn, last_name=ln)
For complex situations you probably want check the first name and last name. The second one usually is a middle name or part of the others, so icontain will catch it. So something like this:
>>> query = 'Jon Van Mendrik'
>>> fn, ln = [i[0], i[-1] for i in query.split()] #take the first and last
>>> print fn, ln
"Jon Mendrik"
>>> qs = Q(first_name__icontains=fn, last_name__icontains=ln)
>>> qs = qs | Q(first_name__icontains=ln, last_name__icontains=fn)
>>> User.objects.filter(qs).distinct()
It's not very pretty, but will probably work for most cases, including where the last name was entered first and so on. Just make sure you understand what I did here and not just copy it blindly to your project -
I took the first and last parts of the string and assigned them to fn and ln respectively
I create a queryset for the firtcase
I then joined it to another queryset for the reverse option using |
lastly, I filtered on that queryset and used distinct to get rid of possible duplicates
You can probably add a lot more logic. For example, 'John A' - the last name is obviously an acronym and not the entire name while the first is a full name. You can implement some logic that takes care of that and build the queryset accordingly (contains is more costly then startswith or then simple comparison). Using Q objects will make that easy. Play around with it, you'll get the hang of it

I suggest you use natural_key. In your User model, add something like:
def natural_key(self):
return (self.first_name, self.last_name)
result:
users = User.objects.filter()
for user in users:
print(user.natural_key)
You can then filter on natural keys.
Hope it helps.

Django: union of different queryset on the same model

I'm programming a search on a model and I have a problem.
My model is almost like:
class Serials(models.Model):
id = models.AutoField(primary_key=True)
code = models.CharField("Code", max_length=50)
name = models.CharField("Name", max_length=2000)
and I have in the database tuples like these:
1 BOSTON The new Boston
2 NYT New York journal
3 NEWTON The old journal of Mass
4 ANEWVIEW The view of the young people
If I search for the string new, what I want to have is:
first the names that start with the string
then the codes that start with the string
then the names that contain the string
then the codes that contain the string
So the previous list should appear in the following way:
2 NYT New York journal
3 NEWTON The old journal of Mass
1 BOSTON The new Boston
4 ANEWVIEW The view of the young people
The only way I found to have this kind of result is to make different searches (if I put "OR" in a single search, I loose the order I want).
My problem is that the code of the template that shows the result is really redundant and honestly very ugly, because I have to repeat the same code for all the 4 different querysets. And the worse thing is that I cannot use the pagination!
Now, since the structure of the different querysets is the same, I'm wandering if there is a way to join the 4 querysets and give the template only one queryset.

You can make those four queries and then chain them inside your program:
result = itertools.chain(qs1, qs2, qs3, qs4)
but this doesn't seem to nice because your have to make for queries.
You can also write your own sql using raw sql, for example:
Serials.objects.raw(sql_string)
Also look at this:
How to combine 2 or more querysets in a Django view?

You should also be able to do qs1 | qs2 | qs3 | qs4. This will give you duplicates, however.
What you might want to look into is Q() objects:
from django.db.models import Q
value = "new"
Serials.objects.filter(Q(name__startswith=value) |
Q(code__startswith=value) |
Q(name__contains=value) |
Q(code__contains=value).distinct()
I'm not sure if it will handle the ordering if you do it this way, as this would rely on the db doing that.
Indeed, even using qs1 | qs2 may cause the order to be determined by the db. That might be the drawback (and reason why you might need at least two queries).

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js