Django select inherited model in a single query

Let's assume I have parent and child tables implemented via model inheritance in Django.
models.py
from django.db import models
class A(models.Model):
    a = models.CharField(max_length=100)  # max_length is illustrative
class B(A):
    b = models.CharField(max_length=100)
Now I want to select column b from table B, so I execute:
B.objects.only('b').get(id=4)
But this statement queries the database twice:
SELECT `b`.`a_ptr_id`, `b`.`b` FROM `b` WHERE `b`.`a_ptr_id` = 4; args=(4,)
SELECT `a`.`a`, `b`.`a_id` FROM `b` INNER JOIN `a` ON (`b`.`a_ptr_id` = `a`.`id`) WHERE `b`.`a_ptr_id` = 4; args=(4,)
How do I generate a single query like select b from b where a_ptr_id = ? using Django models?
I want to query the database only once.

It turns out that only one query was generated. The second query appeared because I was checking all of this in debug mode: my IDE evaluated the object automatically, which read the deferred field a and triggered this extra query:
SELECT `a`.`a`, `b`.`a_id` FROM `b` INNER JOIN `a` ON (`b`.`a_ptr_id` = `a`.`id`) WHERE `b`.`a_ptr_id` = 4;
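To illustrate why merely inspecting an object can cost a query, here is a toy model of lazy field loading. It is a plain-Python sketch, not Django's implementation: the DeferredField and Row classes and the query_log list are hypothetical names invented for this example.

```python
class DeferredField:
    """Toy descriptor: 'queries' for its value the first time it is read."""

    def __init__(self, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        if self.name not in instance.__dict__:
            # Stand-in for the extra SELECT a real ORM would issue here.
            instance.query_log.append(f"SELECT {self.name} ...")
            instance.__dict__[self.name] = f"<{self.name} value>"
        return instance.__dict__[self.name]


class Row:
    a = DeferredField("a")  # not loaded by the initial query

    def __init__(self, b):
        self.b = b  # loaded eagerly, like only('b')
        self.query_log = []


row = Row(b="loaded eagerly")
queries_before = len(row.query_log)

_ = row.a  # roughly what a debugger's variable view does
queries_after_first_read = len(row.query_log)

_ = row.a  # the value is cached now, so no further query
queries_after_second_read = len(row.query_log)
```

Outside the debugger nothing reads the deferred attribute, so only the original query runs.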

Related

Django orm subquery - in clause without substitution

I need to build a query using Django ORM, that looks like this one in SQL:
select * from A where id not in (select a_id from B where ... )
I tried the following code:
ids = B.objects.filter(...)
a_objects = A.objects.exclude(id__in=Subquery(ids.values('a__id'))).all()
The problem is that instead of a nested select, Django generates a query that looks like
select * from A where id not in (1, 2, 3, 4, 5 ....)
where the in clause explicitly lists every id to be excluded, which makes the resulting SQL unreadable when it is printed to the logs. Is it possible to adjust this query so that a nested select is used?
So I see that your goal is to get all the A's that have no foreign key relations from B's. If I'm right, then you can just use inverse lookup to do it.
So, when you define models like this:
class A(models.Model):
    pass
class B(models.Model):
    a = models.ForeignKey(to=A, related_name='bs', on_delete=models.CASCADE)
You can filter it like this:
A.objects.filter(bs__isnull=True)
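Also, if you don't define related_name, the reverse lookup name used in filters defaults to the lowercased model name, b (b_set is only the name of the accessor attribute on instances), so you would write A.objects.filter(b__isnull=True)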
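At the SQL level, an isnull filter on a reverse relation corresponds to an outer join that keeps only the rows with no match. The sketch below runs that kind of query with the standard sqlite3 module; the table and column names are assumptions made for the example, and the exact SQL Django emits may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER REFERENCES a(id));
    INSERT INTO a (id) VALUES (1), (2), (3);
    INSERT INTO b (id, a_id) VALUES (10, 1), (11, 1), (12, 3);
""")

# Rows of a that no row of b points to: the outer join leaves b.id NULL for them.
rows = conn.execute("""
    SELECT a.id
    FROM a
    LEFT OUTER JOIN b ON b.a_id = a.id
    WHERE b.id IS NULL
    ORDER BY a.id
""").fetchall()
orphan_ids = [row[0] for row in rows]
```

Here only the row with id 2 has no related b rows, so it is the only one selected.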
To filter on B, you can collect the matching foreign key values:
ids = B.objects.filter(x=x).values_list('a_id', flat=True)
This gives you the ids of the related A rows. Then exclude them:
a_objects = A.objects.exclude(id__in=ids)
As mentioned above, this assumes B has a foreign key to A.
You don't need to do anything special; just use the queryset directly in your filter.
ids = B.objects.filter(...).values('a_id')
a_objects = A.objects.exclude(id__in=ids).all()
# that should generate the subquery statement
select * from A where NOT (id in (select a_id from B where ... ))
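The two SQL shapes discussed in this thread return the same rows; they differ only in whether the database or the application resolves the inner id list. A minimal sqlite3 sketch follows, with an assumed schema and an assumed filter column x on table b.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER REFERENCES a(id), x INTEGER);
    INSERT INTO a (id) VALUES (1), (2), (3), (4);
    INSERT INTO b (id, a_id, x) VALUES (10, 1, 1), (11, 2, 0), (12, 3, 1);
""")

# Shape 1: nested select, a single round trip to the database.
nested = conn.execute(
    "SELECT id FROM a WHERE NOT (id IN (SELECT a_id FROM b WHERE x = ?)) ORDER BY id",
    (1,),
).fetchall()

# Shape 2: the inner query is run first and its ids are inlined (two round trips).
excluded_ids = [row[0] for row in conn.execute("SELECT a_id FROM b WHERE x = ?", (1,))]
placeholders = ", ".join("?" for _ in excluded_ids)
flat = conn.execute(
    f"SELECT id FROM a WHERE id NOT IN ({placeholders}) ORDER BY id",
    excluded_ids,
).fetchall()
```

In Django, the second shape typically shows up when the inner queryset has already been evaluated (for example wrapped in list()) before being passed to id__in. Whether that is what happened in the question is an assumption.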

How do I determine what "key path" arguments to provide to `prefetch_related` in Django?

Note: below, I am going to refer to the arguments supplied to prefetch_related as "key paths". I don't know if that's the best/correct term - so let me know if there's a better term to use and I will update the question.
I created an advanced search page in Django that searches any of a number of fields from 6 different tables (not all of which are connected by a single direct foreign key path) and displays selected fields from all those tables in a results table. The "key paths" included are:
msrun__sample
msrun__sample__tissue
msrun__sample__animal
msrun__sample__animal__tracer_compound
msrun__sample__animal__studies
(Note: no msrun fields are included in the search or display. That specific model class in this particular view only serves as a connection between the model classes involved in the view.)
It makes a huge difference in the run time when I include a prefetch like: .prefetch_related("msrun__sample__animal__studies"), but I see no discernible difference when I include any additional prefetch "key paths".
My question is: how do I determine which "key path" or "key paths" to include in the arguments to prefetch_related? I don't understand the criteria that go into that decision. For example, why would I, or would I not, include all the related "key paths" among the prefetch_related arguments?
I tried this out with the models below:
class A(models.Model):
    name = models.CharField(max_length=100)
    my_b_set = models.ManyToManyField('B', related_name='my_a')
class B(models.Model):
    name = models.CharField(max_length=100)
    my_c_set = models.ManyToManyField('C', related_name='my_b')
    my_d_set = models.ManyToManyField('D', related_name='my_c')
class C(models.Model):
    name = models.CharField(max_length=100)
class D(models.Model):
    name = models.CharField(max_length=100)
And filled the relationships like so:
a = A.objects.create(name='a1')
b1 = B.objects.create(name='b1')
b2 = B.objects.create(name='b2')
c1 = C.objects.create(name='c1')
c2 = C.objects.create(name='c2')
d1 = D.objects.create(name='d1')
d2 = D.objects.create(name='d2')
a.my_b_set.add(b1)
a.my_b_set.add(b2)
b1.my_c_set.add(c1, c2)
b1.my_d_set.add(d1, d2)
b2.my_c_set.add(c1, c2)
b2.my_d_set.add(d1, d2)
Then run this query:
A.objects.prefetch_related('my_b_set', 'my_b_set__my_c_set', 'my_b_set__my_d_set')
As expected, it made 4 queries:
-- Get all A's
SELECT "changelog_a"."id", "changelog_a"."name" FROM "changelog_a"
-- Get all the related B's of A mapped with A's id
SELECT ("changelog_a_my_b_set"."a_id") AS "_prefetch_related_val_a_id", "changelog_b"."id", "changelog_b"."name"
FROM "changelog_b" INNER JOIN "changelog_a_my_b_set" ON ("changelog_b"."id" = "changelog_a_my_b_set"."b_id")
WHERE "changelog_a_my_b_set"."a_id" IN (2)
-- Get all the related C's of B mapped with B's id
SELECT ("changelog_b_my_c_set"."b_id") AS "_prefetch_related_val_b_id", "changelog_c"."id", "changelog_c"."name"
FROM "changelog_c" INNER JOIN "changelog_b_my_c_set" ON ("changelog_c"."id" = "changelog_b_my_c_set"."c_id")
WHERE "changelog_b_my_c_set"."b_id" IN (3, 4)
-- Get all the related D's of B mapped with B's id
SELECT ("changelog_b_my_d_set"."b_id") AS "_prefetch_related_val_b_id", "changelog_d"."id", "changelog_d"."name"
FROM "changelog_d" INNER JOIN "changelog_b_my_d_set" ON ("changelog_d"."id" = "changelog_b_my_d_set"."d_id")
WHERE "changelog_b_my_d_set"."b_id" IN (3, 4)
So in this example, key paths that overlap (here, all three start with my_b_set) do not repeat the query for B; each additional path only adds one query for the new relation it introduces.
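A prefetch path pays off only when the code actually walks that relation for every row, because it swaps one query per parent object for one batched IN query. The sqlite3 sketch below counts queries for both strategies. The schema and the run helper are assumptions for illustration, and it mimics the batching idea rather than reproducing Django's prefetch code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE b (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE c (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE b_c (b_id INTEGER, c_id INTEGER);
    INSERT INTO b VALUES (3, 'b1'), (4, 'b2');
    INSERT INTO c VALUES (1, 'c1'), (2, 'c2');
    INSERT INTO b_c VALUES (3, 1), (3, 2), (4, 1), (4, 2);
""")
query_log = []


def run(sql, params=()):
    """Execute a query and record it so the two strategies can be compared."""
    query_log.append(sql)
    return conn.execute(sql, params).fetchall()


# Without prefetching: one query for the parents plus one per parent (N + 1).
lazy = {}
for b_id, _name in run("SELECT id, name FROM b ORDER BY id"):
    rows = run(
        "SELECT c.name FROM c JOIN b_c ON c.id = b_c.c_id "
        "WHERE b_c.b_id = ? ORDER BY c.id",
        (b_id,),
    )
    lazy[b_id] = [name for (name,) in rows]
lazy_query_count = len(query_log)

# With prefetching: one query for the parents plus one batched IN query.
query_log.clear()
b_ids = [b_id for b_id, _name in run("SELECT id, name FROM b ORDER BY id")]
marks = ", ".join("?" for _ in b_ids)
prefetched = {b_id: [] for b_id in b_ids}
for b_id, name in run(
    "SELECT b_c.b_id, c.name FROM c JOIN b_c ON c.id = b_c.c_id "
    f"WHERE b_c.b_id IN ({marks}) ORDER BY c.id",
    b_ids,
):
    prefetched[b_id].append(name)
prefetch_query_count = len(query_log)
```

With two parents the saving is only one query, but the lazy count grows with the number of rows while the prefetched count stays fixed. This suggests why a path that is traversed per result row makes a large difference and a path that is never traversed makes none.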

Using existing field values in django update query

I want to update a bunch of rows in a table so that each row's shared_task_id is set to that row's own id. How would I do the below?
from metadataorder.tasks.models import Task
tasks = Task.objects.filter(task_definition__cascades=False).update(shared_task_id=self.id)
The equivalent SQL would be:
update tasks_task t join tasks_taskdefinition d
on t.task_definition_id = d.id
set t.shared_task_id = t.id
where d.cascades = 0
You can do this using an F expression:
from django.db.models import F
# update() returns the number of rows matched, not the task objects
tasks = Task.objects.filter(task_definition__cascades=False).update(shared_task_id=F('id'))
There are some restrictions on what you can do with F objects in an update call, but it'll work fine for this case:
Calls to update can also use F expressions to update one field based on the value of another field in the model.
However, unlike F() objects in filter and exclude clauses, you can’t introduce joins when you use F() objects in an update – you can only reference fields local to the model being updated. If you attempt to introduce a join with an F() object, a FieldError will be raised[.]
https://docs.djangoproject.com/en/dev/topics/db/queries/#updating-multiple-objects-at-once
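For reference, here is the kind of statement this update boils down to, run against an in-memory SQLite database. The table names are taken from the SQL in the question. Writing the filter as an IN subquery is an assumption made for portability (SQLite has no UPDATE ... JOIN), and the SQL Django generates may be shaped differently depending on the backend.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tasks_taskdefinition (id INTEGER PRIMARY KEY, cascades INTEGER);
    CREATE TABLE tasks_task (
        id INTEGER PRIMARY KEY,
        task_definition_id INTEGER REFERENCES tasks_taskdefinition(id),
        shared_task_id INTEGER
    );
    INSERT INTO tasks_taskdefinition VALUES (1, 0), (2, 1);
    INSERT INTO tasks_task VALUES (10, 1, NULL), (11, 2, NULL), (12, 1, NULL);
""")

# Each matching row copies its own id into shared_task_id; the value comes
# from the row being updated, so no join is needed on the SET side.
cursor = conn.execute("""
    UPDATE tasks_task
    SET shared_task_id = id
    WHERE task_definition_id IN (
        SELECT id FROM tasks_taskdefinition WHERE cascades = 0
    )
""")
updated_count = cursor.rowcount
rows = conn.execute(
    "SELECT id, shared_task_id FROM tasks_task ORDER BY id"
).fetchall()
```

The join is only needed to decide which rows to update, which is why the restriction quoted above (no joins in the assigned value) does not get in the way here.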
I stumbled upon this topic and ran into Django's limitation on updates that join across foreign keys, so I now use raw SQL for that case:
from django.db import connection
with connection.cursor() as cursor:
    # UPDATE ... JOIN is MySQL syntax
    cursor.execute("UPDATE a JOIN b ON a.b_id = b.id SET a.myField = b.myField")

set minimum primary key for saving django models in database

I am new to Django. I have two models, Model A and a new Model B:
class A(models.Model):
    firstname = models.CharField(max_length=20, blank=True, null=True)
    email = models.EmailField(max_length=30, blank=True, null=True)
I have to migrate all the data from A to B in such a way that the primary key of an entry in Model A is the same in Model B, i.e.
b.id = a.id where a and b are instances of A and B respectively.
But after this, when I save a new instance, the id generated is 5L, 6L, etc. instead of continuing from the primary key of the last object created. Is there any way to fix this?
I am using Django 1.3 with PostgreSQL 9.2.
You can copy the data from table A into table B using an INSERT ... SELECT statement, for example:
INSERT INTO B (id, firstname, email) SELECT a.id, a.firstname, a.email FROM A a;
Hope this helps.
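A small sketch of copying rows from A into B while keeping their primary keys, using the standard sqlite3 module. The lowercase table names and the sample rows are assumptions for the example; SQLite stands in for PostgreSQL only so that the snippet runs anywhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, firstname TEXT, email TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, firstname TEXT, email TEXT);
    INSERT INTO a VALUES
        (1, 'Ann', 'ann@example.com'),
        (2, 'Bob', 'bob@example.com'),
        (7, 'Cy', 'cy@example.com');
""")

# Copy every row, listing id explicitly so the primary keys carry over.
conn.execute("INSERT INTO b (id, firstname, email) SELECT id, firstname, email FROM a")
copied = conn.execute("SELECT id, firstname FROM b ORDER BY id").fetchall()

# A later insert without an explicit id continues after the highest copied id.
new_id = conn.execute(
    "INSERT INTO b (firstname, email) VALUES ('Dee', 'dee@example.com')"
).lastrowid
```

SQLite picks the next id from the largest existing one. PostgreSQL instead draws from a sequence that explicit-id inserts do not advance, so after such a copy the sequence may need resetting; Django's sqlsequencereset management command prints the SQL for that.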

django "a contains b" on ManyToManyField between A and B

I have two tables A and B. A.bs is a ManyToManyField onto B.
I want to fetch all a in A where a.bs contains a certain b from B.
The only way I know how to do it is like this:
def get_all_A_containing_b(b):
    return filter(lambda a: b in a.bs.all(), A.objects.all())
I'd prefer to have this all done by the DBMS, but I don't want to write any SQL code or use django internals.
The SQL would look something like this: (I can't remember the semantics of JOIN and nulls so this may be wrong)
SELECT * FROM A a
LEFT JOIN A2B a2b on a2b.a_id = a.id
LEFT JOIN B b on a2b.b_id = b.id
WHERE b.id = $b;
where $b is replaced with the id of the b from B I want.
What's the problem with the following?
a_list = A.objects.filter(bs=b)
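Filtering on the many-to-many field lets the database do the membership test with an inner join on the link table, so no LEFT JOIN or null handling is involved. A sqlite3 sketch of that query follows; the link table name a_bs and the helper a_containing are assumptions for the example, and the SQL Django generates will use its own table names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE a_bs (a_id INTEGER REFERENCES a(id), b_id INTEGER REFERENCES b(id));
    INSERT INTO a VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
    INSERT INTO b VALUES (1, 'b1'), (2, 'b2');
    INSERT INTO a_bs VALUES (1, 1), (1, 2), (2, 2);
""")


def a_containing(b_id):
    """Return the rows of a that are linked to the given b through a_bs."""
    return conn.execute(
        "SELECT a.id, a.name FROM a "
        "INNER JOIN a_bs ON a_bs.a_id = a.id "
        "WHERE a_bs.b_id = ? ORDER BY a.id",
        (b_id,),
    ).fetchall()


linked_to_b2 = a_containing(2)
linked_to_b1 = a_containing(1)
```

Only the link table needs to be joined, since the condition is on the b id alone.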
Have you tried using the reverse lookup through one of the automatic _set attributes?
b = B.objects.get(id=b_id)
a_list = b.a_set.all()
I am answering from my mobile so I can't test if this works.
-Justin