Rails ActiveRecord scoped many to many query - ruby-on-rails-4

In my models, Event and Artist are related to each other through HABTM.
Events are default scoped to current events by date, so that only events that haven't happened yet are shown. I'm trying to write a method or scope that returns every Artist with no current events. I tried an ActiveRecord relation:
none = Artist.where{|a| a.events.default_scope.count == 0}
But this returns
ActiveRecord::QueryMethods::WhereChain
so I can't get the actual objects to work with. Iterating over .all works, but it's very slow because there's a ton of data in the models:
Artist.all.select{|a| a.events.default_scope.count == 0}
What is a faster or better way to handle this?

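Use a LEFT JOIN on the join table and on events, put the current-events condition inside the ON clause of the events join (in a WHERE clause it would contradict the NULL check), then group by artist and keep only the artists with no matching event:
scope :no_events, -> {
  joins('LEFT JOIN artists_events ON artists.id = artists_events.artist_id')
    .joins('LEFT JOIN events ON events.id = artists_events.event_id AND [here the condition for current events]')
    .group('artists.id')
    .having('COUNT(events.id) = 0')
}
Grouping already returns each artist once, so no distinct is needed. A plain .where('events.id IS NULL') would also match artists that have both past and current events, because the row for each past event has no matching current event.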
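To check this kind of query outside Rails, here is a minimal sketch of the same anti-join as plain SQL, run against an in-memory SQLite database from Python. The table layout, the starts_on column, and the sample rows are assumptions made up for the illustration; the GROUP BY / HAVING COUNT(...) = 0 form is used so that an artist with both past and current events is excluded.

```python
import sqlite3

# Hypothetical schema and data, only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artists (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE events (id INTEGER PRIMARY KEY, starts_on TEXT);
    CREATE TABLE artists_events (artist_id INTEGER, event_id INTEGER);

    INSERT INTO artists VALUES (1, 'past only'), (2, 'past and current'), (3, 'no events');
    INSERT INTO events VALUES (1, '2000-01-01'), (2, '2999-01-01');
    INSERT INTO artists_events VALUES (1, 1), (2, 1), (2, 2);
""")

# The "current event" condition lives in the ON clause, so artists whose
# events are all in the past still produce a row with events.id = NULL.
NO_CURRENT_EVENTS_SQL = """
    SELECT artists.name
    FROM artists
    LEFT JOIN artists_events ON artists.id = artists_events.artist_id
    LEFT JOIN events
           ON events.id = artists_events.event_id
          AND events.starts_on >= :today
    GROUP BY artists.id
    HAVING COUNT(events.id) = 0
    ORDER BY artists.id
"""


def artists_without_current_events(connection, today):
    """Return the names of artists that have no event on or after `today`."""
    rows = connection.execute(NO_CURRENT_EVENTS_SQL, {"today": today}).fetchall()
    return [name for (name,) in rows]


names = artists_without_current_events(conn, "2024-01-01")
print(names)
```

The artist with one past and one current event is left out because COUNT(events.id) only counts rows where the filtered join found a current event.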

Related

How can you filter a Django query's joined tables then iterate the joined tables in one query?

I have table Parent, and a table Child with a foreign key to table Parent.
I want to run a query for all Parents with a child called Eric, and report Eric's age.
I run:
parents = Parent.objects.filter(child__name='Eric')
I then iterate over the queryset:
for parent in parents:
    print(f'Parent name {parent.name} child Eric age {parent.child.age}')
Clearly this doesn't work - I need to access child through the foreign key object manager, so I try:
for parent in parents:
    print(f'Parent name {parent.name}')
    for child in parent.child_set.all():
        print(f'Child Eric age {child.age}')
Django returns all children's ages, not just children named Eric.
I can repeat the filter conditions:
parents = Parent.objects.filter(child__name='Eric')
for parent in parents:
    print(f'Parent name {parent.name}')
    for child in parent.child_set.filter(name='Eric'):
        print(f'Child Eric age {child.age}')
But this means duplicate code (so risks future inconsistency when another dev makes a change to one not the other), and runs a second query on the database.
Is there a way of getting the matching records and iterating over them? Been Djangoing for years and can't believe I can't do this!
PS. I know that I can do Child.objects.filter(name='Eric').select_related('parent'). But what I would really like to do involves a second child table. So add to the above example a table Address with a foreign key to Parent. I want to get parents with children named Eric and addresses in Timbuktu, and iterate over all the Timbuktu addresses and all the little Erics. This is why I don't want to use Child's object manager.
This is the best I could come up with - three queries, repeating each filter.
from django.db.models import Prefetch
children = Child.objects.filter(name='Eric')
addresses = Address.objects.filter(town='Timbuktu')
parents = (
    Parent.objects
    .filter(child__name='Eric', address__town='Timbuktu')
    .prefetch_related(Prefetch('child_set', children))
    .prefetch_related(Prefetch('address_set', addresses))
)
The .values function gives you direct access to the recordset returned (thank you @Iain Shelvington):
parents_queryset_dicts = (
    Parent.objects
    .filter(child__name='Eric', address__town='Timbuktu')
    .values('id', 'name', 'child__id', 'address__id', 'child__age', 'address__whatever')
    .order_by('id', 'child__id', 'address__id')
)
Note though that this retrieves a Cartesian product of children and addresses, so our gain in reduced query count is slightly offset by double-sized result sets and de-duplication below. So I am starting to think two queries using Child.objects and Address.objects is superior - slightly slower but simpler code.
In my actual use case I have multiple, multi-table chains of foreign key joins, so am splitting the query to prevent the Cartesian join, but still making use of the .values() approach to get filtered, nested tables.
If you then want a hierarchical structure, eg, for sending as JSON to the client, to produce:
parents = {
    parent_id: {
        'name': name,
        'children': {
            child_id: {
                'age': child_age
            },
        },
        'addresses': {
            address_id: {
                'whatever': address_whatever
            },
        },
    },
}
Run something like:
prev_parent_id = prev_child_id = prev_address_id = None
parents = {}
for parent in parents_queryset_dicts:
    if parent['id'] != prev_parent_id:
        parents[parent['id']] = {'name': parent['name'], 'children': {}, 'addresses': {}}
        prev_parent_id = parent['id']
    if parent['child__id'] != prev_child_id:
        parents[parent['id']]['children'][parent['child__id']] = {'age': parent['child__age']}
        prev_child_id = parent['child__id']
    if parent['address__id'] != prev_address_id:
        parents[parent['id']]['addresses'][parent['address__id']] = {'whatever': parent['address__whatever']}
        prev_address_id = parent['address__id']
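As a variation on the de-duplication above, here is a minimal sketch that does not depend on row ordering: it keys every level by id, so repeated rows from the Cartesian product simply overwrite themselves. The function name nest_rows and the sample rows are hypothetical; the rows are plain dicts shaped like the output of .values(), so the sketch runs without Django.

```python
from itertools import product


def nest_rows(rows):
    """Group flat parent/child/address rows into a nested dict keyed by id."""
    parents = {}
    for row in rows:
        # Create the parent entry the first time its id is seen.
        parent = parents.setdefault(
            row["id"], {"name": row["name"], "children": {}, "addresses": {}}
        )
        # Duplicate child/address rows overwrite identical entries.
        parent["children"][row["child__id"]] = {"age": row["child__age"]}
        parent["addresses"][row["address__id"]] = {"whatever": row["address__whatever"]}
    return parents


# Made-up data: two children times two addresses gives four joined rows.
children = [(10, 7), (11, 9)]
addresses = [(100, "north"), (101, "south")]
rows = [
    {"id": 1, "name": "Pat", "child__id": c_id, "child__age": age,
     "address__id": a_id, "address__whatever": whatever}
    for (c_id, age), (a_id, whatever) in product(children, addresses)
]

nested = nest_rows(rows)
print(len(rows), nested)
```

The trade-off is a few redundant dictionary writes per duplicated row in exchange for not having to track the previous ids or rely on order_by.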
This is dense code, and you no longer get access to any fields not explicitly extracted and copied in, including any nested ~_set querysets, and de-duplication of the Cartesian product is not obvious to later developers. You can grab the queryset, keep it, then extract the .values, so you have both from the same, single, database query. But often the three-query approach with repeated filters is a bit cleaner, even if it costs a couple more database queries:
from django.db.models import Prefetch
children = Child.objects.filter(name='Eric')
addresses = Address.objects.filter(town='Timbuktu')
parents_queryset = (
    Parent.objects
    .filter(child__name='Eric', address__town='Timbuktu')
    .prefetch_related(Prefetch('child_set', children))
    .prefetch_related(Prefetch('address_set', addresses))
)
parents = {}
for parent in parents_queryset:
    parents[parent.id] = {'name': parent.name, 'children': {}, 'addresses': {}}
    for child in parent.child_set.all():  # this is implicitly filtered by the Prefetch
        parents[parent.id]['children'][child.id] = {'age': child.age}
    for address in parent.address_set.all():  # also implicitly filtered
        parents[parent.id]['addresses'][address.id] = {'whatever': address.whatever}
One last approach, which someone briefly posted then deleted - I'd love to know why - is using annotate and F() objects. I have not experimented with this, the SQL generated looks fine though and it seems to run a single query and not require repeating filters:
from django.db.models import F
parents = (
    Parent.objects.filter(child__name='Eric')
    .annotate(child_age=F('child__age'))
)
Pros and cons seem identical to .values() above, although .values() seems slightly more basic Django (so easier to read) and you don't have to duplicate field names (eg, with the obfuscation above of child_age=child__age). Advantages might be convenience of . accessors instead of ['field'], you keep hold of the lazy nested recordsets, etc. - although if you're counting the queries you probably want things to fall over if you issue an accidental query per row.

Comparing Date to Minimum Future Value in Other Model

I have two django models that define:
Item = A set of items with an expiry date.
Event = A set of events that have a start and end date.
My aim is that when the item is displayed, its expiry date is shown conditionally formatted, based on whether that item expires after the next event's end date but before the following event's end date (so there is a warning when it's due to expire).
The problem becomes, how best to manage this feat?
If I was directly accessing the database, I'd be using subqueries to get the minimum end date still in the future, and then comparing that to the expiry date on an if basis to swap the formatting.
From research, I'm coming to the conclusion that this is best handled in the view logic, rather than trying to set a method in either of the models, but I don't know how best to get this to splice together so I can return the value of my minimum future date and my itemlist object from the Item model.
The current view is pretty simple at this stage (it will have more filtering options later):
def itemlist(request):
    item_list = Item.objects.all()
    return render(request, "itemlist.html", {'item_list': item_list})
but I can't see a way of easily returning a Django equivalent of something like what I'd do in straight SQL:
select item from items where status != expired and expiry_date <= (select min(end_date) from events where end_date >= getdate() )
EDIT: Since I've written this, I've realised the comparison for what I want is a little more complex, as it's not the minimum date, it's the next to minimum.
For Item A, expiry_date 01/05/19
Event A: end_date 25/04/19
Event B: end_date 10/05/19
What I need it to do is check the events when reading back the item list, see that Item A's expiry date is after the next event.end_date for event A, but is before the event.end_date for event B, so set a flag for using conditional formatting on the template's expiry date display.
Eventually, I suppose, the wish list is to also be able to say for every item "what's the latest event I can renew this item before it expires if there's an event in the list after its expiry time."
I could not completely understand your requirements from your description, but you can use subqueries in Django as well. If you annotate like this:
import datetime
from django.db.models import Subquery
now = datetime.datetime.utcnow()
Item.objects.annotate(last_event_time=Subquery(Event.objects.filter(end_date__gt=now).values('end_date').order_by('-end_date')[:1]))
Each item in the resulting queryset will have a last_event_time field holding the end_date of the latest future event.
You can also use this field in further filtering, using F expressions:
from django.db.models import F
Item.objects.annotate(last_event_time=Subquery(Event.objects.filter(end_date__gt=now).values('end_date').order_by('-end_date')[:1])).filter(expiry_date__lte=F('last_event_time'))
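For the "next to minimum" comparison described in the question's edit, here is a minimal pure-Python sketch of the flag logic, which could be applied in the view once the upcoming end dates are loaded. The function name is hypothetical, and the strict inequalities are an assumption about how boundary dates should be treated. The sample dates are the ones from the question (Item A, Event A, Event B), read as day/month/year.

```python
from datetime import date


def expires_between_next_two_events(expiry_date, event_end_dates, today):
    """True if the item expires after the next event ends but before the one after it ends."""
    # Keep only events that have not ended yet, earliest first.
    upcoming = sorted(d for d in event_end_dates if d >= today)
    if len(upcoming) < 2:
        return False
    next_end, following_end = upcoming[0], upcoming[1]
    return next_end < expiry_date < following_end


event_ends = [date(2019, 5, 10), date(2019, 4, 25)]  # Event B, Event A
flag = expires_between_next_two_events(date(2019, 5, 1), event_ends, today=date(2019, 4, 1))
print(flag)
```

Doing this in Python keeps the template simple: the view can attach the resulting boolean to each item and the template only switches a CSS class on it.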

Doctrine ORM - Calculate non-mapped property in query

I have two relevant entities - Thread and Reply, with a user able to post replies to threads.
When I return a list of thread entities using the ORM QueryBuilder I'd like to also include a boolean flag indicating whether the current user has posted a reply to that thread. Initially I thought about adding a property to the Thread entity and somehow setting that in a query, but it doesn't feel like the Thread should need to be aware of users posting replies. What is the best way to deal with this, ideally preventing the need to perform a secondary query per Thread returned?
You could use left join with:
$qb = $em->getRepository(Thread::class)->createQueryBuilder('a');
$qb->addSelect('count(r)');
$qb->leftJoin('a.replies', 'r' , 'WITH', 'r.user = :userId');
$qb->groupBy('a');
$qb->setParameter('userId', 8);
$result = $qb->getQuery()->getResult();
Using count(r) you will get how many replies the user has posted per thread. You will have to loop through the result to check whether the count is > 0.
foreach ($result as $row) {
    $thread = $row[0];
    $hasReplied = $row[1] > 0;
}

SQL Doctrine: InnerJoin where A.key = B.key or A.key = NULL

I'm trying to do a "complex" SQL request using Doctrine:
I have a sports project with multiple delegations.
I created events ('e') to tell each one what will happen next.
One event can be for multiple delegations; if it's for all of them, it's linked to none (to save database space).
So I have a ManyToMany relation between Events and Delegations.
I would like to fetch all events that concern one delegation ('A') and start after now (->where('e.startTime > :date')).
The events are linked by an inner join:
->innerJoin('e.delegations', 'd', 'WITH', 'd.name = A' )
This works quite well, but events associated with no delegation are not returned.
Then I tried:
->leftJoin('e.delegations', 'd', 'WITH', 'd.name = A' )
But this returns all the events.
So I need to add an orWhere to catch e.delegations = null, but I don't know how to use it, because it will break the previous where on the time.
Or maybe I can specify something in the innerJoin (like an "or NULL"), but I didn't find this in any Doctrine QueryBuilder doc.
I found this in French, but it only uses a join filtered with a where, so it's certainly not optimal compared to an inner/left join, and will be harder to maintain.
Problem
This problem is a little tricky: once the join filters delegations by name (d.name = A), any event (e) that lacks that delegation but has others is treated as if it had none, because the filter removes the other delegations from the joined rows.
Solution
Solve this by using a subquery:
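// subquery: only returns the ids of events that have delegation 'A'
// ($em is assumed to be your EntityManager)
$sqb = $em->createQueryBuilder();
$sqb->select('e1.id')
    ->from('MyNamespace\Entity\Event', 'e1')
    ->innerJoin('e1.delegations', 'd1', 'WITH', "d1.name = 'A'");
// main query: events with no delegation at all, or events returned by the subquery
// (d.id assumes the Delegation identifier is named id)
$qb
    ->leftJoin('e.delegations', 'd')
    ->andWhere('d.id IS NULL OR e.id IN (' . $sqb->getDQL() . ')');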
PS
If you need to use a placeholder (:param) in a subquery, always set the parameter on the main QueryBuilder.
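The shape of the resulting query can be tried outside Doctrine. The following is a minimal sketch in plain SQL, run on an in-memory SQLite database from Python; the table names, column names, and sample rows are assumptions invented for the illustration, and the start-time filter is left out to keep it short. In raw SQL the unfiltered LEFT JOIN yields one row per delegation, so DISTINCT is added here.

```python
import sqlite3

# Hypothetical schema and data, only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE delegations (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE events_delegations (event_id INTEGER, delegation_id INTEGER);

    INSERT INTO delegations VALUES (1, 'A'), (2, 'B');
    INSERT INTO events VALUES (1, 'A only'), (2, 'B only'), (3, 'A and B'), (4, 'everyone');
    INSERT INTO events_delegations VALUES (1, 1), (2, 2), (3, 1), (3, 2);
""")

# Events with no delegation at all, or events whose id is returned by the
# subquery that looks for the named delegation.
EVENTS_FOR_DELEGATION_SQL = """
    SELECT DISTINCT e.id, e.name
    FROM events e
    LEFT JOIN events_delegations ed ON ed.event_id = e.id
    WHERE ed.event_id IS NULL
       OR e.id IN (
            SELECT ed1.event_id
            FROM events_delegations ed1
            JOIN delegations d1 ON d1.id = ed1.delegation_id
            WHERE d1.name = :name
       )
    ORDER BY e.id
"""


def events_for_delegation(connection, name):
    """Return names of events aimed at `name` or at every delegation."""
    rows = connection.execute(EVENTS_FOR_DELEGATION_SQL, {"name": name}).fetchall()
    return [event_name for _event_id, event_name in rows]


events_for_a = events_for_delegation(conn, "A")
print(events_for_a)
```

The event linked only to delegation B is excluded, while the event with no delegation rows is kept by the IS NULL branch.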

How to "filter" by "exists" in Django?

I would like to filter a queryset by whether a certain subquery returns any results. In SQL this might look like this:
SELECT * FROM events e WHERE EXISTS
    (SELECT * FROM tags t WHERE t.event_id = e.id AND t.text IN ('abc', 'def'))
In other words, retrieve all events which are tagged with one of the specified tags.
How might I express this using Django's QuerySet API on models Event and Tag?
You can do something like this:
q = Event.objects.filter(tag__text__in=['abc', 'def'])
Assuming that there is a ForeignKey from Tag to Event.
Explanation: You are filtering Event objects based on a specific criteria. Using the double underscore syntax you are accessing the text attribute of the Tag instances and then attaching the IN condition. You don't have to worry about the join on foreign key; Django does that for you behind the scenes. In case you are curious to see the query generated, you can print it:
print(q.query)
Manoj's solution may cause a problem when there are multiple tags for an event.
The SQL inner join returns one row per matching tag, so an event with several matching tags appears more than once in the results. The solution is to add the distinct method:
q = Event.objects.filter(tag__text__in=['abc', 'def']).distinct()
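The difference between the three forms can be seen directly in SQL. Below is a minimal sketch on an in-memory SQLite database from Python, comparing a plain inner join, the same join with DISTINCT, and the EXISTS subquery from the question. The events and tags tables and their rows are made up for the illustration.

```python
import sqlite3

# Hypothetical schema and data, only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, event_id INTEGER, text TEXT);

    INSERT INTO events VALUES (1, 'launch'), (2, 'meetup'), (3, 'other');
    INSERT INTO tags (event_id, text) VALUES (1, 'abc'), (1, 'def'), (2, 'abc'), (3, 'zzz');
""")


def names(sql):
    """Run a query whose last column is the event name and return the names."""
    return [row[-1] for row in conn.execute(sql).fetchall()]


# Plain inner join: one row per matching tag, so 'launch' shows up twice.
joined = names("""
    SELECT e.name FROM events e
    JOIN tags t ON t.event_id = e.id
    WHERE t.text IN ('abc', 'def')
    ORDER BY e.id
""")

# Same join with DISTINCT: duplicates collapsed.
distinct = names("""
    SELECT DISTINCT e.id, e.name FROM events e
    JOIN tags t ON t.event_id = e.id
    WHERE t.text IN ('abc', 'def')
    ORDER BY e.id
""")

# EXISTS subquery: each event is tested once, so no duplicates arise.
exists = names("""
    SELECT e.name FROM events e
    WHERE EXISTS (
        SELECT 1 FROM tags t
        WHERE t.event_id = e.id AND t.text IN ('abc', 'def')
    )
    ORDER BY e.id
""")

print(joined, distinct, exists)
```

Both the DISTINCT join and the EXISTS form return each tagged event once; which one performs better depends on the database and its indexes.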