I recently asked how to solve a simple SQL query. Turns out that there are many solutions.
After some benchmarking I think this is the best one:
SELECT DISTINCT c.*
FROM Camera c
INNER JOIN cameras_features fc1 ON c.id = fc1.camera_id AND fc1.feature_id = 1
INNER JOIN cameras_features fc2 ON c.id = fc2.camera_id AND fc2.feature_id = 2
Now, I have no clue how to perform this query with the Django ORM.
If you need exactly this query, you can execute it in Django as raw SQL (see the Django documentation on performing raw SQL queries).
It's good practice to put your SQL code into a custom manager.
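For what it's worth, here is a minimal sketch of what the double self-join computes. It uses Python's built-in sqlite3 module instead of Django so that it runs on its own; the table contents and the cameras_with_both helper are made up for illustration. With the ORM, chaining two filter() calls over a multi-valued relation, for example Camera.objects.filter(features__id=1).filter(features__id=2), should produce one join per call. That assumes a ManyToManyField named features, which the question does not show.

```python
import sqlite3

# Hypothetical schema and data, only to exercise the query from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE camera (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE cameras_features (camera_id INTEGER, feature_id INTEGER);
    INSERT INTO camera VALUES (1, 'both'), (2, 'only_f1'), (3, 'only_f2');
    INSERT INTO cameras_features VALUES (1, 1), (1, 2), (2, 1), (3, 2);
""")


def cameras_with_both(conn, first_feature, second_feature):
    """Return the names of cameras linked to both feature ids."""
    rows = conn.execute(
        """
        SELECT DISTINCT c.name
        FROM camera c
        INNER JOIN cameras_features fc1
            ON c.id = fc1.camera_id AND fc1.feature_id = ?
        INNER JOIN cameras_features fc2
            ON c.id = fc2.camera_id AND fc2.feature_id = ?
        ORDER BY c.name
        """,
        (first_feature, second_feature),
    ).fetchall()
    return [name for (name,) in rows]


both = cameras_with_both(conn, 1, 2)
print(both)
```

Each join narrows the result to cameras that have a matching row for one feature, so only cameras with both survive.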
Related
We have two models, A and B, with a many-to-many relation.
I want to obtain a sql query similar to this:
SELECT *
FROM a LEFT JOIN ab_relation
ON ab_relation.a_id = a.id
JOIN b ON ab_relation.b_id = b.id;
So in django when I try:
A.objects.prefetch_related('bees')
I get 2 queries similar to:
SELECT * FROM a;
SELECT ab_relation.a_id AS prefetch_related_val_a_id, b.*
FROM b JOIN ab_relation ON b.id = ab_relation.b_id
WHERE ab_relation.a_id IN (123, 456... list of all a.id);
Given that A and B have moderately big tables, I find the way django does it too slow for my needs.
The question is: Is it possible to obtain the left join manually written query through the ORM?
Edits to answer some clarifications:
Yes a LEFT OUTER JOIN would be preferable to get all A's in the queryset, not only those with a relation with B (updated sql).
Moderately big means ~4k rows each, and too slow means ~3 seconds (on first load, before redis cache.) Keep in mind there are other queries on the page.
Actually yes, we need only B.one_field but having tried with Prefetch('bees', queryset=B.objects.values('one_field')) an error said you can't use values in a prefetch.
The queryset will be used as options for a multi-select form-field, where we will need to represent A objects that have a relation with B with an extra string from the B.field.
For the direct answer skip to point 6)
Let's go step by step.
1) N:M select. You say you want a query like this:
SELECT *
FROM a JOIN ab_relation ON ab_relation.a_id = a.id
JOIN b ON ab_relation.b_id = b.id;
But this is not a real N:M query, because you are getting only A-B related objects. The query should use outer joins, at least like this:
SELECT *
FROM a left outer JOIN
ab_relation ON ab_relation.a_id = a.id left outer JOIN
b ON ab_relation.b_id = b.id;
Otherwise you get only the A rows that have a related B.
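A small sketch of the difference, again with the standard sqlite3 module so it can be run without a Django project. The rows are invented: a1 is related to one B, a2 is related to none.

```python
import sqlite3

# Hypothetical data: a1 has a related B, a2 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, one_field TEXT);
    CREATE TABLE ab_relation (a_id INTEGER, b_id INTEGER);
    INSERT INTO a VALUES (1, 'a1'), (2, 'a2');
    INSERT INTO b VALUES (10, 'b10');
    INSERT INTO ab_relation VALUES (1, 10);
""")

# Plain (inner) joins: a2 disappears because it has no relation row.
inner_rows = conn.execute("""
    SELECT a.name, b.one_field
    FROM a
    JOIN ab_relation ON ab_relation.a_id = a.id
    JOIN b ON ab_relation.b_id = b.id
    ORDER BY a.id
""").fetchall()

# Left outer joins: a2 is kept, with NULL for the B column.
outer_rows = conn.execute("""
    SELECT a.name, b.one_field
    FROM a
    LEFT OUTER JOIN ab_relation ON ab_relation.a_id = a.id
    LEFT OUTER JOIN b ON ab_relation.b_id = b.id
    ORDER BY a.id
""").fetchall()

print(inner_rows)
print(outer_rows)
```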
2) Read big tables. You say "moderately big tables". Are you sure you want to read the whole table from the database? Reading a lot of data is unusual in a web environment, and in that case you can paginate. Maybe it is not a web app? Why do you need to read these big tables? We need context to answer your question.
3) Select * from. Are you sure you need all fields from both tables? If you read only some values, this query may run faster:
A.objects.values("some_a_field", "another_a_field", "bees__some_b_field")
4) In summary: the ORM is a powerful tool, and two single read operations are "fast". I have written down some ideas, but we probably need more context to answer your question: what "moderately big tables" means, what "slow" means, what you are doing with this data, how many fields or bytes each row of each table has, and so on.
Edited because the OP has edited the question.
5) Use right UI controls. You say:
The queryset will be used as options for a multi-select form-field, where we will need to represent A objects that have a relation with B with an extra string from the B.field.
Sending 4k rows to the client for a form looks like an anti-pattern. I suggest moving to a live control that loads only the data it needs, for example by filtering on some typed text. Take a look at the awesome django-select2 project.
6) You say
The question is: Is it possible to obtain the left join manually written query through the ORM?
The answer is: yes, you can do it using values(), as I said in point 3. Sample: Material and ResultatAprenentatge are in an N:M relation:
>>> print( Material
.objects
.values( "titol", "resultats_aprenentatge__codi" )
.query )
The query:
SELECT "material_material"."titol",
"ufs_resultataprenentatge"."codi"
FROM "material_material"
LEFT OUTER JOIN "material_material_resultats_aprenentatge"
ON ( "material_material"."id" =
"material_material_resultats_aprenentatge"."material_id" )
LEFT OUTER JOIN "ufs_resultataprenentatge"
ON (
"material_material_resultats_aprenentatge"."resultataprenentatge_id" =
"ufs_resultataprenentatge"."id" )
ORDER BY "material_material"."data_edicio" DESC
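One thing to keep in mind with this approach: values() across an N:M relation gives one row per (A, B) pair, so the rows usually have to be grouped back per A object before building form options. A minimal sketch in plain Python follows. The group_related helper and the sample rows are hypothetical, and it groups by titol only to match the sample above; in real code grouping by the primary key would be safer, since titles may not be unique.

```python
from collections import defaultdict


def group_related(rows, key_field, related_field):
    """Collapse flat joined rows into {key: [related values]}."""
    grouped = defaultdict(list)
    for row in rows:
        # Touching the key creates the entry even when there is no related row.
        related = grouped[row[key_field]]
        if row[related_field] is not None:
            related.append(row[related_field])
    return dict(grouped)


# Shape of what a values() call over a LEFT OUTER JOIN yields (made-up data).
rows = [
    {"titol": "m1", "resultats_aprenentatge__codi": "RA1"},
    {"titol": "m1", "resultats_aprenentatge__codi": "RA2"},
    {"titol": "m2", "resultats_aprenentatge__codi": None},
]

options = group_related(rows, "titol", "resultats_aprenentatge__codi")
print(options)
```

Objects without any related row come out with an empty list, which mirrors what the outer join preserves.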
I have this query, which orders posts by the latest comment on each post. It works well with small tables. However, I filled the database with random data (approximately 2M rows in the comment table), analyzed the query with EXPLAIN, and saw that a sequential scan is performed on the Post table.
Post.objects.extra(select={'last_update': 'select max(c.create_date) from comment_comment c where c.post_id = post_post.id'}).order_by('-last_update')
I've rewritten the same query in a form that is faster than the current one, but I could not find a way to express it with Django's ORM. How can I rewrite it? If possible, I would like to avoid a raw query.
Regards. Thanks for any help.
select
p.*,
t.last_update
from
post_post p
join
( select c.post_id as pid, max(c.create_date) as last_update from comment_comment c group by pid) t
on p.id = t.pid
order by t.last_update desc
limit 50;
If I make some assumptions about your Django model, it will look something like this:
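from django.db.models import Max

# Assumes the Comment.post foreign key has related_name='comments'.
(Post.objects
    .annotate(last_update=Max('comments__create_date'))
    .order_by('-last_update')[:50])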
In Django, annotate is your friend.
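As far as I know, an annotate() over a reverse relation compiles to a LEFT OUTER JOIN plus GROUP BY rather than to the join-on-subquery form from the question, so the two are not exactly the same query. The sketch below compares both shapes with the standard sqlite3 module; the table names follow the question, the rows are invented.

```python
import sqlite3

# Hypothetical data: 'silent' is a post without any comment.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post_post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE comment_comment (
        id INTEGER PRIMARY KEY, post_id INTEGER, create_date TEXT
    );
    INSERT INTO post_post VALUES (1, 'old'), (2, 'new'), (3, 'silent');
    INSERT INTO comment_comment (post_id, create_date) VALUES
        (1, '2020-01-01'), (1, '2020-02-01'), (2, '2021-06-01');
""")

# Shape of the hand-written query: inner join against a grouped subquery.
join_rows = conn.execute("""
    SELECT p.title, t.last_update
    FROM post_post p
    JOIN (
        SELECT c.post_id AS pid, MAX(c.create_date) AS last_update
        FROM comment_comment c
        GROUP BY c.post_id
    ) t ON p.id = t.pid
    ORDER BY t.last_update DESC
    LIMIT 50
""").fetchall()

# Rough shape of what an annotate(Max(...)) is expected to generate.
left_join_rows = conn.execute("""
    SELECT p.title, MAX(c.create_date) AS last_update
    FROM post_post p
    LEFT OUTER JOIN comment_comment c ON c.post_id = p.id
    GROUP BY p.id
    ORDER BY last_update DESC
    LIMIT 50
""").fetchall()

print(join_rows)
print(left_join_rows)
```

The inner-join version drops posts without comments, while the left-join version keeps them with a NULL last_update. Where those NULL rows sort differs between database backends, so it is worth checking on the real database before relying on the first 50 rows.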
I am trying to get the field value of a joined table. This is the generated sql of ORM query.
SELECT subnets_subnetoption.id,
       subnets_subnetoption.subnet_id, subnets_subnetoption.value_id,
       subnets_subnet.id, subnets_subnet.parent_id,
       subnets_subnet.base_address, subnets_subnet.bits,
       subnets_subnet.bcast_address, subnets_subnet.is_physical,
       subnets_subnet.name, subnets_subnet.responsible,
       subnets_subnet.building_floor, subnets_subnet.comments,
       subnets_subnet.vlan_common_name, subnets_subnet.creation_date,
       subnets_subnet.modification_date, subnets_subnet.sec_level,
       subnets_subnet.confid, subnets_subnet.access_type,
       subnets_subnet.zone_type, options_value.id,
       options_value.content, options_value.comment,
       options_value.option_id, options_option.id,
       options_option.name, options_option.required,
       options_option.scope_id, options_scope.id,
       options_scope.name
FROM subnets_subnetoption
INNER JOIN subnets_subnet ON (subnets_subnetoption.subnet_id = subnets_subnet.id)
INNER JOIN options_value ON (subnets_subnetoption.value_id = options_value.id)
INNER JOIN options_option ON (options_value.option_id = options_option.id)
INNER JOIN options_scope ON (options_option.scope_id = options_scope.id)
WHERE subnets_subnetoption.subnet_id = 1
SubnetOption.objects.select_related().filter(subnet_id=subnet['id']).query
I need only options_value.content and options_option.name, but the queryset is giving me the subnetoption table values only. How can I get the values from the joined tables? I am new to Django.
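SubnetOption.objects.filter(subnet_id=subnet['id']).values('value__content', 'value__option__name')
or
SubnetOption.objects.filter(subnet_id=subnet['id']).select_related('value__option')
Try one of these. The lookups assume the foreign key fields are named value and option, as the value_id and option_id columns in the generated SQL suggest; adjust them to your model field names.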
You can use Django's raw queries, which means you can pass the SQL query as it is. For reference, see "Raw sql queries in Django views".
I have two models:
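from django.db import models

class Note(models.Model):
    ...  # <attribs>

class Permalink(models.Model):
    note = models.ForeignKey(Note, on_delete=models.CASCADE)  # on_delete is an assumption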
I want to execute a query: get all notes which don't have a permalink.
In SQL, I would do it as something like:
SELECT * FROM Note WHERE id NOT IN (SELECT note FROM Permalink);
Wondering how to do this in ORM.
Edit: I don't want to get all the permalinks out into my application. Would instead prefer it to run as a query inside the DB.
You should be able to use this query:
Note.objects.filter(permalink__isnull=True)
You can use:
Note.objects.exclude(id__in=Permalink.objects.values_list('note_id', flat=True))
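To see what the two approaches boil down to in SQL, here is a small sketch with the standard sqlite3 module. The schema and rows are invented (the foreign key column is assumed to be note_id), and the permalink id is deliberately different from the note id it points to. The isnull lookup should correspond to the LEFT OUTER JOIN form and the exclude(id__in=...) lookup to the NOT IN form. Note that the subquery has to select the foreign key column, not the permalink's own primary key.

```python
import sqlite3

# Hypothetical data: permalink 2 points at note 1; notes 2 and 3 have none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE note (id INTEGER PRIMARY KEY, body TEXT);
    CREATE TABLE permalink (id INTEGER PRIMARY KEY, note_id INTEGER);
    INSERT INTO note VALUES (1, 'linked'), (2, 'unlinked'), (3, 'also unlinked');
    INSERT INTO permalink VALUES (2, 1);
""")


def ids(sql):
    """Run a query and return the first column as a list."""
    return [row[0] for row in conn.execute(sql)]


# NOT IN with the foreign key column: notes without a permalink.
not_in = ids("""
    SELECT id FROM note
    WHERE id NOT IN (SELECT note_id FROM permalink)
    ORDER BY id
""")

# LEFT OUTER JOIN ... IS NULL: same result, expressed as a join.
left_join = ids("""
    SELECT n.id
    FROM note n
    LEFT OUTER JOIN permalink p ON p.note_id = n.id
    WHERE p.id IS NULL
    ORDER BY n.id
""")

# Selecting the permalink's own id compares unrelated keys and is wrong.
wrong = ids("""
    SELECT id FROM note
    WHERE id NOT IN (SELECT id FROM permalink)
    ORDER BY id
""")

print(not_in, left_join, wrong)
```

Both correct forms run entirely inside the database, which is what the question asks for.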
How can I count records with multiple constraints using django's aggregate functionality?
Using django trunk I'm trying to replace a convoluted database-specific SQL statement with django aggregates. As an example, say I have a database structured with tables for blogs running on many domains (think .co.uk, .com, .etc), each taking many comments:
domains <- blog -> comment
The following SQL counts comments on a per-domain basis:
SELECT D.id, COUNT(C.id) as CommentCount FROM domain AS D
LEFT OUTER JOIN blog AS B ON D.blog_id = B.id
LEFT OUTER JOIN comment AS C ON B.id = C.blog_id
GROUP BY D.id
This is easily replicated with:
Domain.objects.annotate(Count('blogs__comments'))
Taking this a step further, I'd like to be able to add one or more constraints and replicate the following SQL:
SELECT D.id, COUNT(C.id) as CommentCount FROM domain AS D
LEFT OUTER JOIN blog AS B ON D.blog_id = B.id
LEFT OUTER JOIN comment AS C ON B.id = C.blog_id
AND C.active = True
GROUP BY D.id
This is much more difficult to replicate, as Django seems inclined to filter on the whole shaboodle with a WHERE clause:
(Domain.objects.filter(blogs__comments__active=True)
    .annotate(Count('blogs__comments')))
SQL comes out something like this:
SELECT ..., COUNT(comment.id) AS blog__comments__count FROM domain
LEFT OUTER JOIN blog ON domain.blog_id = blog.id
LEFT OUTER JOIN comment ON blog.id = comment.blog_id
WHERE comment.active = True
GROUP BY domain.id
ORDER BY NULL
How can I persuade django to pop the extra constraint on the appropriate LEFT OUTER JOIN? This is important as I want to include a count for those blogs with no comments.
I don't know how to do this using the Django query language, but you could always run a raw SQL query. In case you don't already know how to do that, here's an example:
from django.db import connection

def some_method(request, some_parameter):
    cursor = connection.cursor()
    cursor.execute('SELECT * FROM table WHERE somevar=%s', [some_parameter])
    rows = cursor.fetchall()
More detail is available in the Django book online: http://www.djangobook.com/en/2.0/chapter05/
Look for the section "The “Dumb” Way to Do Database Queries in Views". If you don't want to use the "dumb" way, I'm not sure what your options are.
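For the counting question itself, the sketch below shows why the placement of the condition matters, using the standard sqlite3 module with invented rows. It also shows a third form, a conditional aggregate, which gives the same counts as putting the condition in the join. In later Django versions (2.0 and up, as far as I know) Count() accepts a filter argument, e.g. Count('blogs__comments', filter=Q(blogs__comments__active=True)), which should compile to that kind of conditional aggregate instead of a WHERE clause; the lookup names are taken from the question and not verified against real models.

```python
import sqlite3

# Hypothetical data: domain 10 has two active comments and one inactive,
# domain 20 has only an inactive comment.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blog (id INTEGER PRIMARY KEY);
    CREATE TABLE domain (id INTEGER PRIMARY KEY, blog_id INTEGER);
    CREATE TABLE comment (
        id INTEGER PRIMARY KEY, blog_id INTEGER, active INTEGER
    );
    INSERT INTO blog VALUES (1), (2);
    INSERT INTO domain VALUES (10, 1), (20, 2);
    INSERT INTO comment (blog_id, active) VALUES (1, 1), (1, 1), (1, 0), (2, 0);
""")

# Condition in the join: domains without active comments keep a count of 0.
on_counts = conn.execute("""
    SELECT d.id, COUNT(c.id)
    FROM domain d
    LEFT OUTER JOIN blog b ON d.blog_id = b.id
    LEFT OUTER JOIN comment c ON b.id = c.blog_id AND c.active = 1
    GROUP BY d.id
    ORDER BY d.id
""").fetchall()

# Condition in WHERE: rows for such domains are filtered out entirely.
where_counts = conn.execute("""
    SELECT d.id, COUNT(c.id)
    FROM domain d
    LEFT OUTER JOIN blog b ON d.blog_id = b.id
    LEFT OUTER JOIN comment c ON b.id = c.blog_id
    WHERE c.active = 1
    GROUP BY d.id
    ORDER BY d.id
""").fetchall()

# Conditional aggregate: plain join, the condition lives inside COUNT.
case_counts = conn.execute("""
    SELECT d.id, COUNT(CASE WHEN c.active = 1 THEN c.id END)
    FROM domain d
    LEFT OUTER JOIN blog b ON d.blog_id = b.id
    LEFT OUTER JOIN comment c ON b.id = c.blog_id
    GROUP BY d.id
    ORDER BY d.id
""").fetchall()

print(on_counts, where_counts, case_counts)
```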