Doctrine: fetching all objects whose parent is in a list - doctrine-orm

Assume I have two entities, Article and Comment. I would like to fetch all comments whose article.id is IN (1,2,3,4). What is the best way to achieve this? Am I forced to use a JOIN? Right now I have something like this:
$dql = "SELECT c FROM Comment c LEFT OUTER JOIN c.article a WHERE a.id IN (:articles) ORDER BY a.id ASC";
$query = $this->entityManager->createQuery($dql);
$query->setParameter("articles", array(1,2,3,4));
I haven't tested this query, but I believe something like it should work. Still, it would be lovely to write something like c.article IN (:articles), without the JOIN.

This does work, AFAIK:
SELECT c FROM Comment c WHERE c.article IN(:articles)
$query->setParameter('articles', array(1,2,3,4));
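The reason no JOIN is needed can be sketched outside Doctrine: the foreign key column lives on the comment table itself, so the IN filter can be applied to it directly. The following is a minimal illustration using Python's sqlite3 module, not Doctrine code; the table and column names (article, comment, article_id) and the sample rows are assumptions made for the example.

```python
import sqlite3

# Hypothetical schema: comment.article_id is the foreign key to article.id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article (id INTEGER PRIMARY KEY);
    CREATE TABLE comment (
        id INTEGER PRIMARY KEY,
        article_id INTEGER REFERENCES article(id)
    );
    INSERT INTO article (id) VALUES (1), (2), (3), (4), (5);
    INSERT INTO comment (id, article_id) VALUES (10, 1), (11, 3), (12, 5), (13, 3);
""")

article_ids = [1, 2, 3, 4]

# One placeholder per id; the values themselves are bound, never interpolated.
placeholders = ", ".join("?" for _ in article_ids)
sql = (
    "SELECT id, article_id FROM comment "
    f"WHERE article_id IN ({placeholders}) "
    "ORDER BY article_id, id"
)

# The filter touches only the comment table, so no join on article is required.
rows = conn.execute(sql, article_ids).fetchall()
print(rows)
```

Comment 12 is left out because its article (5) is not in the list.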

Related

Django ORM join many to many relation in one query

Suppose we have two models, A and B, with a many-to-many relation.
I want to obtain a sql query similar to this:
SELECT *
FROM a LEFT JOIN ab_relation
ON ab_relation.a_id = a.id
JOIN b ON ab_relation.b_id = b.id;
So in django when I try:
A.objects.prefetch_related('bees')
I get 2 queries similar to:
SELECT * FROM a;
SELECT ab_relation.a_id AS prefetch_related_val_a_id, b.*
FROM b JOIN ab_relation ON b.id = ab_relation.b_id
WHERE ab_relation.a_id IN (123, 456... list of all a.id);
Given that A and B have moderately big tables, I find the way django does it too slow for my needs.
The question is: Is it possible to obtain the left join manually written query through the ORM?
Edits to answer some clarifications:
Yes a LEFT OUTER JOIN would be preferable to get all A's in the queryset, not only those with a relation with B (updated sql).
Moderately big means ~4k rows each, and too slow means ~3 seconds (on first load, before redis cache.) Keep in mind there are other queries on the page.
Actually yes, we need only B.one_field, but when I tried Prefetch('bees', queryset=B.objects.values('one_field')), an error said you can't use values in a prefetch.
The queryset will be used as options for a multi-select form-field, where we will need to represent A objects that have a relation with B with an extra string from the B.field.
For the direct answer skip to point 6)
Let's go step by step.
1) N:M select. You say you want a query like this:
SELECT *
FROM a JOIN ab_relation ON ab_relation.a_id = a.id
JOIN b ON ab_relation.b_id = b.id;
But this is not a real N:M query, because you are getting only A-B related objects. The query should use outer joins, at least like this:
SELECT *
FROM a
LEFT OUTER JOIN ab_relation ON ab_relation.a_id = a.id
LEFT OUTER JOIN b ON ab_relation.b_id = b.id;
Otherwise you get only the A rows that have a related B.
2) Reading big tables. You say "moderately big tables". Are you sure you want to read the whole table from the database? Reading a lot of data is unusual in a web environment, and in that case you can paginate. Maybe it is not a web app? Why do you need to read these big tables? We need context to answer your question.
3) SELECT *. Are you sure you need all fields from both tables? If you read only some values, the query may run faster:
A.objects.values("some_a_field", "another_a_field", "bees__some_b_field")
4) Summary. The ORM is a powerful tool, and two single read operations are "fast". I have written down some ideas, but we may need more context to answer your question: what "moderately big" means, what "slow" means, what you are doing with this data, how many fields or bytes each row of each table has, and so on.
Edited because the OP has edited the question.
5) Use right UI controls. You say:
The queryset will be used as options for a multi-select form-field, where we will need to represent A objects that have a relation with B with an extra string from the B.field.
Sending 4k rows to the client for a form looks like an anti-pattern. I suggest moving to a live control that loads only the data it needs, for example by filtering on some typed text. Take a look at the awesome django-select2 project.
6) You say
The question is: Is it possible to obtain the left join manually written query through the ORM?
The answer is: yes, you can do it using values, as I said in point 3. Sample: Material and ResultatAprenentatge have an N:M relation:
>>> print(Material.objects
...       .values("titol", "resultats_aprenentatge__codi")
...       .query)
The query:
SELECT "material_material"."titol",
"ufs_resultataprenentatge"."codi"
FROM "material_material"
LEFT OUTER JOIN "material_material_resultats_aprenentatge"
ON ( "material_material"."id" =
"material_material_resultats_aprenentatge"."material_id" )
LEFT OUTER JOIN "ufs_resultataprenentatge"
ON (
"material_material_resultats_aprenentatge"."resultataprenentatge_id" =
"ufs_resultataprenentatge"."id" )
ORDER BY "material_material"."data_edicio" DESC
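A joined query like this returns one row per A-B pair, plus one row with NULLs for every A that has no B, so the rows still have to be collapsed into one option per A before they can feed a multi-select field. The following is a minimal sketch of that step using Python's sqlite3 module instead of the ORM. The tables a, b, ab_relation and the column one_field come from the question; the name column on a, the sample data and the build_choices helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, name TEXT);  -- "name" is hypothetical
    CREATE TABLE b (id INTEGER PRIMARY KEY, one_field TEXT);
    CREATE TABLE ab_relation (a_id INTEGER, b_id INTEGER);
    INSERT INTO a VALUES (1, 'first'), (2, 'second'), (3, 'third');
    INSERT INTO b VALUES (1, 'x'), (2, 'y');
    INSERT INTO ab_relation VALUES (1, 1), (1, 2), (3, 2);
""")

# Single query: every A is kept, with NULL in one_field when it has no B.
rows = conn.execute("""
    SELECT a.id, a.name, b.one_field
    FROM a
    LEFT OUTER JOIN ab_relation ON ab_relation.a_id = a.id
    LEFT OUTER JOIN b ON ab_relation.b_id = b.id
    ORDER BY a.id, b.id
""").fetchall()


def build_choices(rows):
    """Collapse (a_id, name, one_field) rows into one (value, label) pair per A."""
    grouped = {}
    for a_id, name, one_field in rows:
        entry = grouped.setdefault(a_id, (name, []))
        if one_field is not None:  # NULL means "this A has no related B"
            entry[1].append(one_field)
    choices = []
    for a_id, (name, extras) in grouped.items():
        label = f"{name} ({', '.join(extras)})" if extras else name
        choices.append((a_id, label))
    return choices


choices = build_choices(rows)
print(choices)
```

With the ORM, the same grouping could be applied to the dictionaries returned by values(); only the row access would change.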

DQL join between unrelated entities?

Can I have a DQL join between unrelated entities using the WITH DQL operator, or is defining a relationship absolutely mandatory?
I have a unidirectional relationship between Category and CategorySubscription: CategorySubscription has a many-to-one unidirectional relationship with Category. I want to grab a list of Categories c and left join CategorySubscription cs WITH cs.category_id = c.id AND cs.user_id = value.
Can I do this somehow?
Starting with Doctrine version 2.3 you can, as mentioned in a blog.
It's also mentioned in the docs if you know where to look. Scroll down to the last example under "15.2.4. DQL SELECT Examples":
Joins between entities without associations were not possible until version 2.4, where you can generate an arbitrary join with the following syntax:
<?php
$query = $em->createQuery('SELECT u FROM User u JOIN Blacklist b WITH u.email = b.email');
I know it says "not possible until version 2.4", but it definitely works starting with 2.3!
You can also try using multiple from() calls and putting the join conditions in where() or andWhere() calls.

Can I use "ON" keyword in DQL or do I need to use Native Query?

I have a oneToMany relationship between Post and PostVote. I would like to be able to retrieve a Post and how a specific user voted on it. In DQL I would like to retrieve Post and related PostVote entity, but only one where user_id is for example 5.
I don't seem to be able to use the ON SQL keyword like this:
->createQuery('SELECT p, pv FROM Post p LEFT JOIN p.postvotes pv ON pv.user = :userid WHERE p.id = :postid')
And if I use WHERE to filter the results, it creates a problem: no result is returned unless at least one matching PostVote exists:
->createQuery('SELECT p, pv FROM Post p LEFT JOIN p.postvotes pv WHERE p.id = :postid AND pv.user = :userid')
Is Native Query the only way I can achieve this?
You have to use the WITH keyword in DQL to accomplish this:
SELECT p, pv FROM Post p LEFT JOIN p.postvotes pv WITH pv.user = :userid WHERE p.id = :postid
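The difference between filtering in WHERE and filtering in the join condition (which is what WITH adds in DQL) can be seen at the SQL level. The following is a rough sketch using Python's sqlite3 module, not Doctrine; the table names post and post_vote, the value column and the sample data are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY);
    CREATE TABLE post_vote (
        id INTEGER PRIMARY KEY,
        post_id INTEGER,
        user_id INTEGER,
        value INTEGER
    );
    INSERT INTO post VALUES (1);
    INSERT INTO post_vote VALUES (1, 1, 7, 1);  -- only user 7 has voted
""")

params = {"postid": 1, "userid": 5}  # user 5 has not voted on post 1

# Filter in WHERE: the NULL-extended row fails pv.user_id = 5, so the post vanishes.
in_where = conn.execute("""
    SELECT p.id, pv.value
    FROM post p
    LEFT JOIN post_vote pv ON pv.post_id = p.id
    WHERE p.id = :postid AND pv.user_id = :userid
""", params).fetchall()

# Filter in the join condition: the post is kept and the vote columns are NULL.
in_join = conn.execute("""
    SELECT p.id, pv.value
    FROM post p
    LEFT JOIN post_vote pv ON pv.post_id = p.id AND pv.user_id = :userid
    WHERE p.id = :postid
""", params).fetchall()

print(in_where)
print(in_join)
```

The first query returns nothing, while the second still returns the post with an empty vote, which is the behaviour the question asks for.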

django "a contains b" on ManyToManyField between A and B

I have two tables A and B. A.bs is a ManyToManyField onto B.
I want to fetch all a in A where a.bs contains a certain b from B.
The only way I know how to do it is like this:
def get_all_A_containing_b(b):
    # a.bs is a related manager, so it has to be evaluated with .all()
    return filter(lambda a: b in a.bs.all(), A.objects.all())
I'd prefer to have this all done by the DBMS, but I don't want to write any SQL code or use django internals.
The SQL would look something like this: (I can't remember the semantics of JOIN and nulls so this may be wrong)
SELECT * FROM A a
LEFT JOIN A2B a2b on a2b.a_id = a.id
LEFT JOIN B b on a2b.b_id = b.id
WHERE b.id = $b;
where $b is replaced with the id of the b from B I want.
What's the problem with
a_list = A.objects.filter(bs=b)?
Have you tried using the reverse lookup through one of the automatic _set attributes?
b = B.objects.get(id=b_id)
a_list = b.a_set.all()
I am answering from my mobile so I can't test if this works.
-Justin

Can I filter on multiple constraints with django aggregate functionality?

How can I count records with multiple constraints using django's aggregate functionality?
Using django trunk, I'm trying to replace a convoluted database-specific SQL statement with django aggregates. As an example, say I have a database structured with tables for blogs running on many domains (think .co.uk, .com, etc.), each taking many comments:
domains <- blog -> comment
The following SQL counts comments on a per-domain basis:
SELECT D.id, COUNT(C.id) as CommentCount FROM domain AS D
LEFT OUTER JOIN blog AS B ON D.blog_id = B.id
LEFT OUTER JOIN comment AS C ON B.id = C.blog_id
GROUP BY D.id
This is easily replicated with:
Domain.objects.annotate(Count('blogs__comments'))
Taking this a step further, I'd like to be able to add one or more constraints and replicate the following SQL:
SELECT D.id, COUNT(C.id) as CommentCount FROM domain AS D
LEFT OUTER JOIN blog AS B ON D.blog_id = B.id
LEFT OUTER JOIN comment AS C ON B.id = C.blog_id
AND C.active = True
GROUP BY D.id
This is much more difficult to replicate, as django seems inclined to filter the whole shebang with a WHERE clause:
Domain.objects.filter(blogs__comments__active=True).annotate(Count('blogs__comments'))
SQL comes out something like this:
SELECT ..., COUNT(comment.id) AS blog__comments__count FROM domain
LEFT OUTER JOIN blog ON domain.blog_id = blog.id
LEFT OUTER JOIN comment ON blog.id = comment.blog_id
WHERE comment.active = True
GROUP BY domain.id
ORDER BY NULL
How can I persuade django to pop the extra constraint on the appropriate LEFT OUTER JOIN? This is important as I want to include a count for those blogs with no comments.
I don't know how to do this using the Django query language, but you could always run a raw SQL query. In case you don't already know how to do that, here's an example:
from django.db import connection
def some_method(request, some_parameter):
    cursor = connection.cursor()
    # Pass values as parameters; never interpolate them into the SQL string
    cursor.execute('SELECT * FROM table WHERE somevar=%s', [some_parameter])
    rows = cursor.fetchall()
    return rows
More detail is available in the Django book online: http://www.djangobook.com/en/2.0/chapter05/
Look for the section "The “Dumb” Way to Do Database Queries in Views". If you don't want to use the "dumb" way, I'm not sure what your options are.
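Applied to the question, the raw-SQL route would run the second statement from the question, with the extra constraint inside the join condition. The following is a minimal sketch using Python's sqlite3 module as a stand-in for Django's connection.cursor(); the sample data and the count_comments helper are assumptions made for the example, and sqlite3 uses ? as its placeholder where Django's cursor uses %s.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blog (id INTEGER PRIMARY KEY);
    CREATE TABLE domain (id INTEGER PRIMARY KEY, blog_id INTEGER);
    CREATE TABLE comment (id INTEGER PRIMARY KEY, blog_id INTEGER, active INTEGER);
    INSERT INTO blog VALUES (1), (2);
    INSERT INTO domain VALUES (10, 1), (20, 2);
    -- blog 1 has two active comments, blog 2 has only an inactive one
    INSERT INTO comment VALUES (100, 1, 1), (101, 1, 0), (102, 1, 1), (103, 2, 0);
""")


def count_comments(conn, active):
    """Count comments per domain, keeping domains whose count is zero."""
    cursor = conn.cursor()
    cursor.execute("""
        SELECT D.id, COUNT(C.id) AS CommentCount
        FROM domain AS D
        LEFT OUTER JOIN blog AS B ON D.blog_id = B.id
        LEFT OUTER JOIN comment AS C ON B.id = C.blog_id AND C.active = ?
        GROUP BY D.id
        ORDER BY D.id
    """, [active])
    return cursor.fetchall()


counts = count_comments(conn, 1)
print(counts)
```

Domain 20 is still listed, with a count of 0, because the constraint sits in the join rather than in WHERE. Django 2.0 and later also accept a filter argument on aggregates, for example Count('blogs__comments', filter=Q(blogs__comments__active=True)), which may make the raw query unnecessary.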