I want to retrieve a collection of objects based on what they are associated with, for example a category. This would be a many-to-many relationship.
I've been able to achieve that with MEMBER OF; however, I need to pass in an array of IDs rather than one at a time. I see there is an "IN ()", but it seems to require a subquery, which I would like to avoid.
MEMBER OF example:
SELECT o FROM Entity\Object1 o WHERE 'CATEGORY_CODE' MEMBER OF o.categories
(Edit)
This is what I would like to do, but perhaps I'm misunderstanding how entities work in DQL:
SELECT o FROM Entity\Object1 o WHERE o.categories.Id IN (id, id, id)
SELECT o FROM Entity\Object1 o JOIN o.categories c WHERE c.id in ('id', 'id', 'id');
If this isn't what you want, you'll have to edit your question to be more specific.
This is similar to other questions regarding complicated prefetches but seems to be slightly different. Here are some example models:
class Author(Model):
    pass

class Book(Model):
    authors = ManyToManyField(Author, through='BookAuthor')

class BookAuthor(Model):
    book = ForeignKey(Book, ...)
    author = ForeignKey(Author, ...)

class Event(Model):
    book = ForeignKey(Book, ...)
    author = ForeignKey(Author, ...)
In summary: a BookAuthor links a book to one of its authors, and an Event also concerns a book and one of its authors (the latter constraint is unimportant). One could design the relationships differently but it's too late for that now, and in my case, this in fact is part of migrating away from the current setup.
Now suppose I have a BookAuthor instance and want to find any events that relate to that combination of book and author. I can follow either the book or author relations on BookAuthor and their event_set reverse relationship, but neither gives me what I want and, if I have several BookAuthors it becomes a pain.
It seems that the way to get an entire queryset "annotated" onto a model instance is via prefetch_related but as far as I can tell there is no way, in the Prefetch object's queryset property, to refer to the root object's fields. I would like to do something like this:
BookAuthor.objects.filter(...).prefetch_related(
    Prefetch(
        'book__event_set',
        queryset=Event.objects.filter(
            author=OuterRef('author')
        )
    )
)
However, OuterRef only works in subqueries and this is not one. The answer to this question suggests I could use a subquery here but I don't understand how it could ever work: you have to put the subquery inside a query to work in the Prefetch, and the OuterRef refers to that object/DB row; there is no way to get back to the original, root object. If I translate the code there into my situation it is clear that the OuterRef is referring to the outer Event query, not the outer BookAuthor.
To make the question precise: what I want is O(1) queries which, for a queryset of BookAuthors annotate each instance - or one of its foreign keys - with a collection of the corresponding Events. Obviously one can get all the Events and glue everything together in python. I want to avoid this but if anyone has a particularly elegant way of doing that, it would also be useful to know.
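For reference, the "glue everything together in Python" fallback mentioned above can be sketched as follows. This is only an illustration under assumptions: `BookAuthorRow` and `EventRow` are hypothetical stand-ins for the model instances, and in a real project the two lists would presumably come from two queries (one for the BookAuthors, one for the Events of the books involved).

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class EventRow:
    # Hypothetical stand-in for an Event instance.
    id: int
    book_id: int
    author_id: int


@dataclass
class BookAuthorRow:
    # Hypothetical stand-in for a BookAuthor instance.
    id: int
    book_id: int
    author_id: int
    events: list = field(default_factory=list)


def attach_events(book_authors, events):
    """Attach to each BookAuthor the events sharing its (book, author) pair."""
    by_pair = defaultdict(list)
    for event in events:
        by_pair[(event.book_id, event.author_id)].append(event)
    for ba in book_authors:
        # .get() avoids inserting empty lists into the defaultdict.
        ba.events = by_pair.get((ba.book_id, ba.author_id), [])
    return book_authors


book_authors = [
    BookAuthorRow(1, book_id=10, author_id=100),
    BookAuthorRow(2, book_id=10, author_id=101),
]
events = [EventRow(1, 10, 100), EventRow(2, 10, 101), EventRow(3, 10, 100)]
attach_events(book_authors, events)
print([e.id for e in book_authors[0].events])
```

The grouping is a single pass over the events plus a single pass over the BookAuthors, so the number of queries stays constant regardless of how many BookAuthors there are.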
Let's suppose the following model:
class Person(models.Model):
    name = models.CharField(null=False)
    ...
I know that I can filter Persons with a name that contains the letter a using Person.objects.filter(name__contains='a'). And I know that I can filter Persons with name is in a list using Person.objects.filter(name__in=['John Doe', 'Mary Jane']).
Is it possible (or, what is the performant way) to do both things using just one filter?
I know that I can run 2 queries (maybe more) and fetch the data. But in my current case, I have a method on my view called get_filters that returns a Q object, which is then used in the get_queryset method. So I need to implement this inside a Q object and with only 1 query. Is that possible?
Exactly as you figured, you can build up a Q() filter object.
Q(name__contains='a') & Q(name__in=['John Doe', 'Mary Jane'])
However, that will only ever match an object named Mary Jane, since John Doe doesn't contain an a. (The equivalent SQL is name LIKE '%a%' AND name IN ('John Doe', 'Mary Jane').)
If you mean "find any object containing any of these substrings", that's possible too:
q = Q()
for substring in ['John', 'Mary', 'Jane']:
    q |= Q(name__contains=substring)
This will be the equivalent of name LIKE '%John%' OR name LIKE '%Mary%' OR name LIKE '%Jane%'.
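To see the two SQL conditions side by side, here is a small sketch that runs them directly with Python's built-in sqlite3 module. The `person` table and its three rows are made up for the illustration, and SQLite's LIKE is case-insensitive for ASCII, so other databases may behave slightly differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO person (name) VALUES (?)",
    [("John Doe",), ("Mary Jane",), ("Peter Parker",)],
)

# AND of the two conditions: substring match and membership in a list.
and_rows = conn.execute(
    "SELECT name FROM person "
    "WHERE name LIKE '%a%' AND name IN ('John Doe', 'Mary Jane') "
    "ORDER BY name"
).fetchall()

# OR over several substrings, built dynamically like the Q() loop.
substrings = ["John", "Mary", "Jane"]
where = " OR ".join("name LIKE ?" for _ in substrings)
params = [f"%{s}%" for s in substrings]
or_rows = conn.execute(
    f"SELECT name FROM person WHERE {where} ORDER BY name", params
).fetchall()

print(and_rows)
print(or_rows)
```

Only the number of placeholders is interpolated into the SQL string; the substrings themselves are passed as parameters.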
class A(models.Model):
    results = models.TextField()

class B(models.Model):
    name = models.CharField(max_length=20)
    res = models.ManyToManyField(A)
Let's suppose we have the two models above, and that the A model has millions of objects.
I would like to know the most efficient (fastest) way to get all the results objects of a particular B object.
Let's suppose we have to retrieve all results for object number 5 of B.
Option 1 : A.objects.filter(b__id=5)
(OR)
Option 2 : B.objects.get(id=5).res.all()
Option 1: would filtering by id on the A model take a long time, since there are millions of A objects?
Option 2: does the res field on the B model store the id values of the A objects?
I assume option 2 would be faster, since B stores references to the A objects: it can look those objects up directly and then make the second query to fetch the results, whereas in option 1 filtering by id or any other field would take a long time.
The first expression will result in one database query. Indeed, it will query with:
SELECT a.*
FROM a
INNER JOIN a_b ON a_b.a_id = a.id
WHERE a_b.b_id = 5
The second expression will result in two queries. Indeed, first Django will query to fetch that specific B object with a query like:
SELECT b.*
FROM b
WHERE b.id = 5
then it will make exactly the same query to retrieve the related A objects.
But retrieving the B object is not necessary here (unless, of course, you need it somewhere else). You thus make a useless database query.
My Question is filtering by id on A model objects would take lot of time? since there are millions of A model objects.
A database normally stores an index on foreign key fields, so it can filter efficiently. The total number of A objects is usually not (that) relevant, since the index uses a data structure such as a B-tree to accelerate the search. The wiki page has a section named "An index speeds the search" that explains how this works.
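As a rough way to try this out, the sketch below rebuilds the three tables in an in-memory SQLite database and runs the join from option 1. The table and index names (`a`, `b`, `a_b`, `a_b_b_id_idx`) follow the simplified SQL above and are assumptions; the tables Django actually generates are named differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, results TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE a_b (
        id INTEGER PRIMARY KEY,
        a_id INTEGER REFERENCES a(id),
        b_id INTEGER REFERENCES b(id)
    );
    CREATE INDEX a_b_b_id_idx ON a_b (b_id);
""")
conn.executemany("INSERT INTO a (id, results) VALUES (?, ?)",
                 [(i, f"result {i}") for i in range(1, 1001)])
conn.executemany("INSERT INTO b (id, name) VALUES (?, ?)",
                 [(i, f"b{i}") for i in range(1, 11)])
conn.executemany("INSERT INTO a_b (a_id, b_id) VALUES (?, ?)",
                 [(7, 5), (42, 5), (8, 4)])

QUERY = """
    SELECT a.id, a.results
    FROM a
    INNER JOIN a_b ON a_b.a_id = a.id
    WHERE a_b.b_id = ?
    ORDER BY a.id
"""
rows = conn.execute(QUERY, (5,)).fetchall()
print(rows)

# Ask SQLite how it plans to run the query; the wording of the plan
# differs between SQLite versions, so it is only printed here.
for step in conn.execute("EXPLAIN QUERY PLAN " + QUERY, (5,)):
    print(step[-1])
```

The printed plan should show whether the lookup on `a_b.b_id` goes through the index rather than scanning all of `a`.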
I'm implementing search functionality with an option of looking for a record by matching multiple tables and multiple fields in these tables.
Say I want to find a Customer by his/her first or last name, or by the ID of a placed Order, which is stored in a different model than Customer.
The easy scenario, which I have already implemented, is that a user types only a single word into the search field. I then use Django Q objects to query the Order model, using a direct field reference or a related_query_name reference like:
result = Order.objects.filter(
    Q(customer__first_name__icontains=user_input)
    | Q(customer__last_name__icontains=user_input)
    | Q(order_id__icontains=user_input)
).distinct()
Piece of a cake, no problems at all.
But what if the user wants to narrow the search and types multiple words into the search field?
Example: the user has typed Bruce and got a whole lot of records back as a result of the search.
Now he/she wants to be more specific and adds the customer's last name to the search. So the search becomes Bruce Wayne; after splitting this into separate parts I have Bruce and Wayne. Obviously I don't want to search the Order model, because order_id is a single-word value that is sufficient to find the customer on its own, so for this case I'm dropping it from the query altogether.
Now I'm trying to match the customer by both first AND last name. I also want to handle the scenario where the order of the provided data is random, so that both Bruce Wayne and Wayne Bruce are handled properly: I still have the customer's full name, but the positions of the first and last name aren't fixed.
And this is the question I'm looking for an answer to: how do I build a query that searches multiple fields of a model without knowing which of the search words belongs to which table?
I'm guessing the solution is trivial and there's surely an elegant way to create such a dynamic query, but I can't think of one.
You can dynamically OR a variable number of Q objects together to achieve your desired search. The approach below makes it trivial to add or remove fields you want to include in the search.
from functools import reduce
from operator import or_

from django.db.models import Q

fields = (
    'customer__first_name__icontains',
    'customer__last_name__icontains',
    'order_id__icontains'
)
parts = []
terms = ["Bruce", "Wayne"]  # produce this from your search input field
for term in terms:
    for field in fields:
        # One Q object per (field, term) combination
        parts.append(Q(**{field: term}))
query = reduce(or_, parts)
result = Order.objects.filter(query).distinct()
The use of reduce combines the Q objects by ORing them together. Credit for that part goes to this answer.
The solution I came up with is rather complex, but it works exactly the way I wanted to handle this problem:
from functools import reduce
from operator import and_, or_

from django.db.models import Q

search_keys = user_input.split()
if len(search_keys) > 1:
    first_name_set = set()
    last_name_set = set()
    for key in search_keys:
        first_name_set.add(Q(customer__first_name__icontains=key))
        last_name_set.add(Q(customer__last_name__icontains=key))
    # (any key matches the first name) AND (any key matches the last name)
    query = reduce(and_, [reduce(or_, first_name_set), reduce(or_, last_name_set)])
else:
    search_fields = [
        Q(customer__first_name__icontains=user_input),
        Q(customer__last_name__icontains=user_input),
        Q(order_id__icontains=user_input),
    ]
    query = reduce(or_, search_fields)
result = Order.objects.filter(query).distinct()
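To check the AND-of-ORs logic without a database, here is a minimal sketch in plain Python. `Pred` and `icontains` are hypothetical stand-ins for Q and the `__icontains` lookup (they are not Django APIs), and the customer records are made up; the point is only to show how reduce folds the pieces together.

```python
from functools import reduce
from operator import and_, or_


class Pred:
    """Hypothetical stand-in for Q: wraps a test function on a record."""

    def __init__(self, test):
        self.test = test

    def __or__(self, other):
        return Pred(lambda rec: self.test(rec) or other.test(rec))

    def __and__(self, other):
        return Pred(lambda rec: self.test(rec) and other.test(rec))

    def __call__(self, rec):
        return self.test(rec)


def icontains(field, term):
    # Case-insensitive substring test, mimicking the __icontains lookup.
    return Pred(lambda rec: term.lower() in rec[field].lower())


def build_query(user_input):
    # Assumes at least one search term; reduce() raises on an empty list.
    terms = user_input.split()
    first = reduce(or_, [icontains("first_name", t) for t in terms])
    last = reduce(or_, [icontains("last_name", t) for t in terms])
    return reduce(and_, [first, last])


customers = [
    {"first_name": "Bruce", "last_name": "Wayne"},
    {"first_name": "Bruce", "last_name": "Banner"},
    {"first_name": "Wayne", "last_name": "Rooney"},
]

for text in ("Bruce Wayne", "Wayne Bruce"):
    query = build_query(text)
    print(text, [c for c in customers if query(c)])
```

Both word orders select only the Bruce Wayne record, because each name field just has to match any one of the terms.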
I'm having trouble reducing the number of queries for a particular view. It's a fairly heavy one but I'm sure it can be reduced:
Profile:
    name = CharField()
Officers:
    club = ManyToManyField(Club, related_name='officers')
    title = CharField()
Club:
    name = CharField()
    members = ManyToManyField(Profile)
Election:
    club = ForeignKey(Club)
    elected = ForeignKey(Profile)
    title = CharField()
    when = DateTimeField()
Clubs have members and officers (president, tournament director). People can be members of multiple clubs etc...
Officers are elected at elections, the results of which are stored.
Given a player, how can I find the most recently elected officer at each of the player's clubs?
At the moment I have
clubs = Club.objects.filter(members=me).prefetch_related('officers')
for c in clubs:
    officers = c.officers.all()
    # One extra query per club: this is the part that does not scale
    most_recent = Election.objects.filter(club=c).filter(elected__in=officers).order_by('-when')[:1].get()
    print(c.name + ' elected ' + most_recent.elected.name + ' most recently')
The problem is the looped query: it's nice and fast if you're a member of 1 club, but if you join fifty, my database crawls.
Edit:
The answer from Nil does what I want but doesn't get the object. I don't really need the object but I do need another field as well as the datetime. If it's helpful the query:
Club.objects.annotate(last_election=Max('election__when'))
produces the raw SQL
SELECT "organisation_club"."id", "organisation_club"."name", MAX("organisation_election"."when") AS "last_election"
FROM "organisation_club"
LEFT OUTER JOIN "organisation_election" ON ( "organisation_club"."id" = "organisation_election"."club_id" )
GROUP BY "organisation_club"."id", "organisation_club"."name"
I'd really like an ORM answer if at all possible (or a 'mostly' ORM answer).
I believe this is what you're looking for:
from django.db.models import Max, F
Election.objects.filter(club__members=me) \
    .annotate(max_date=Max('club__election__when')) \
    .filter(when=F('max_date')).select_related('elected')
Relations can be followed forwards and backwards again in a single statement, allowing you to annotate the max_date for any election related to the club of the current election. The F class allows you to filter a queryset based on selected fields in SQL, including any extra fields added through annotation, aggregation, joins etc.
What you want, in SQL terms, is to query the Election table, group the rows by Club, and keep only the last election of each club.
Now, how can we translate that into the Django ORM? Looking at the documentation, we learn that we can do it with an annotation. The trick is that you need to think in reverse. You want to annotate (add a new piece of data to) each club with its last election. This gives us:
from django.db.models import Max

Club.objects.annotate(last_election=Max('election__when'))
# Use it in a for loop like this
for club in Club.objects.annotate(last_election=Max('election__when')):
    print(club, club.last_election)
Sadly, this only adds the date, which doesn't answer your question! You want the name or the complete Club object. I searched and I still don't know how to do it properly. If everything fails though, you can still use a raw SQL query in Django using a query like in the first link.
The simplest way I can think of is filtering partially at the application level
If you do
e = Election.objects.filter(club__members=me).select_related('elected')
or
e = Election.objects.filter(club__in=me.club_set.all()).select_related('elected')
This is a single query and it should get back all the elections that happened for all the clubs that the member me is in. Then you can use Python to just get the most recent date. Of course, if you have many elections per club, you end up fetching much more data than will be used.
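The application-level step could look roughly like the sketch below. The election rows are plain dictionaries with made-up values standing in for the fetched Election objects, so the field names are assumptions for illustration only.

```python
from datetime import datetime


def latest_per_club(elections):
    """Keep, for each club, the election with the greatest `when`."""
    latest = {}
    for election in elections:
        current = latest.get(election["club"])
        if current is None or election["when"] > current["when"]:
            latest[election["club"]] = election
    return latest


# Hypothetical rows, as they might come back from the single query.
elections = [
    {"club": "Chess", "elected": "Ann", "when": datetime(2019, 5, 1)},
    {"club": "Chess", "elected": "Bob", "when": datetime(2021, 5, 1)},
    {"club": "Go", "elected": "Cy", "when": datetime(2020, 3, 1)},
]

for club, election in latest_per_club(elections).items():
    print(club, election["elected"], election["when"])
```

This is one pass over the fetched rows, so the cost is in transferring the unused elections rather than in the Python loop itself.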
Another way which should do it in two queries:
from functools import reduce

from django.db.models import Max, Q

# Get all member's clubs & most recent election
clubs = Club.objects.filter(members=me).annotate(last_election=Max('election__when'))
# Create filters for election based on the club id and the latest election time
election_Q = [Q(club__id=c.id) & Q(when=c.last_election) for c in clubs]
# Combine filters with an OR (reduce raises TypeError if there are no clubs)
election_filter = reduce(lambda f1, f2: f1 | f2, election_Q)
# Get elections restricting by specific clubs & election date;
# 'club' is selected too so that e.club.name does not trigger extra queries
elections = Election.objects.filter(election_filter).select_related('elected', 'club')
for e in elections:
    print('%s elected %s most recently at %s' % (e.club.name, e.elected, e.when))
This builds upon @Nil's method and uses its result to build a query in Python, then feeds it into the second query. However, there is a limit on the size of a SQL statement, and if the member is in a lot of clubs you may hit that limit. The limit is fairly high, though, and I've only ever reached it when importing large datasets in a single INSERT statement, so I think it should be fine for your purpose.
Sorry I cannot think of a way that the Django ORM can link them together using a single SQL query. The Django ORM is actually quite limited for complex queries so if you really need the efficiency I think it's probably best to write the raw SQL query.
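For what it's worth, a raw SQL version of "latest election per club for one member" might look something like the sketch below, shown here against an in-memory SQLite database. The table and column names are simplified assumptions (the real tables would be prefixed with the app name, as in the raw SQL quoted in the question), the data is made up, and the officers filter from the original loop is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE profile (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE club (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE club_members (club_id INTEGER, profile_id INTEGER);
    CREATE TABLE election (
        id INTEGER PRIMARY KEY,
        club_id INTEGER,
        elected_id INTEGER,
        title TEXT,
        "when" TEXT  -- ISO dates, so string comparison orders correctly
    );
    INSERT INTO profile VALUES (1, 'Me'), (2, 'Ann'), (3, 'Bob'), (4, 'Cy');
    INSERT INTO club VALUES (1, 'Chess'), (2, 'Go'), (3, 'Bridge');
    INSERT INTO club_members VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO election (club_id, elected_id, title, "when") VALUES
        (1, 2, 'President', '2019-05-01'),
        (1, 3, 'President', '2021-05-01'),
        (2, 4, 'Tournament director', '2020-03-01'),
        (3, 2, 'President', '2022-01-01');
""")

LATEST_PER_CLUB = """
    SELECT c.name, p.name, e.title, e."when"
    FROM election e
    JOIN club c ON c.id = e.club_id
    JOIN profile p ON p.id = e.elected_id
    JOIN club_members m ON m.club_id = c.id
    WHERE m.profile_id = ?
      AND e."when" = (
          SELECT MAX(e2."when") FROM election e2 WHERE e2.club_id = e.club_id
      )
    ORDER BY c.name
"""
rows = conn.execute(LATEST_PER_CLUB, (1,)).fetchall()
for club_name, elected_name, title, when in rows:
    print(f"{club_name} elected {elected_name} ({title}) most recently at {when}")
```

The correlated subquery keeps only the row whose date equals the club's maximum, which returns the other Election columns along with the date in one statement.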