Django: How can I pass a subquery in a LEFT JOIN?

I have 3 models (A,B,C):
class A(models.Model):
    url = models.URLField()
    uuid = models.UUIDField()
    name = models.CharField(max_length=400)
    id = models.IntegerField()
class B(models.Model):
    user = models.ForeignKey(C, to_field='user_id',
                             on_delete=models.PROTECT)
    uuid = models.ForeignKey(A, to_field='uuid',
                             on_delete=models.PROTECT)
and I want to perform the following SQL query using the Django ORM:
SELECT A.id, COUNT(A.id), COUNT(foo.user)
FROM A
LEFT JOIN (SELECT uuid, user FROM B where user = '<a_specific_user_id>') as foo
ON A.uuid = foo.uuid_id
WHERE name = '{}'
GROUP by 1
HAVING COUNT(A.id)> 1 AND COUNT(A.id)>COUNT(foo.user)
My problem is mainly with LEFT JOIN. I know I can form a LEFT JOIN by checking for the existence of null fields on table B:
A.objects.filter(name='{}', b__isnull=True).values('id', 'name')
but how can I LEFT JOIN on the specific sub-query I want?
I tried using Subquery() but it seems to populate the final WHERE statement and not pass my custom sub-query in the LEFT JOIN.

For anyone stumbling upon this in the future: I asked directly on the Django IRC channel, and it was confirmed that, as of now, it is not possible to include a custom subquery in a LEFT JOIN clause using the Django ORM.
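Since the ORM cannot express this join, one fallback is to run the SQL directly. The following is only a minimal sketch of the query from the question, using Python's built-in sqlite3 module instead of Django so that it runs standalone. The table names a and b, the column names uuid_id and user_id, and all sample rows are assumptions made up for illustration.

```python
import sqlite3

# In-memory database with hypothetical stand-ins for models A and B.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, uuid TEXT, name TEXT);
    CREATE TABLE b (uuid_id TEXT, user_id TEXT);
    INSERT INTO a VALUES (1, 'u1', 'x'), (1, 'u2', 'x'),
                         (2, 'u3', 'x'), (2, 'u4', 'x'),
                         (3, 'u5', 'x');
    INSERT INTO b VALUES ('u1', 'alice'), ('u2', 'bob'),
                         ('u3', 'alice'), ('u4', 'alice');
""")

# LEFT JOIN against a subquery that is pre-filtered to one user.
sql = """
    SELECT a.id, COUNT(a.id), COUNT(foo.user_id)
    FROM a
    LEFT JOIN (SELECT uuid_id, user_id FROM b WHERE user_id = ?) AS foo
        ON a.uuid = foo.uuid_id
    WHERE a.name = ?
    GROUP BY a.id
    HAVING COUNT(a.id) > 1 AND COUNT(a.id) > COUNT(foo.user_id)
"""
rows = conn.execute(sql, ("alice", "x")).fetchall()
print(rows)
```

With this sample data only id 1 survives the HAVING clause: it has two rows in a, and only one of them is matched by a row of the chosen user in b.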

Related

Django FULL OUTER JOIN

I have these three tables
class IdentificationAddress(models.Model):
    id_ident_address = models.AutoField(primary_key=True)
    ident = models.ForeignKey('Ident', models.DO_NOTHING, db_column='ident')
    address = models.TextField()
    time = models.DateTimeField()
    class Meta:
        managed = False
        db_table = 'identification_address'
class IdentC(models.Model):
    id_ident = models.AutoField(primary_key=True)
    ident = models.TextField(unique=True)
    name = models.TextField()
    class Meta:
        managed = False
        db_table = 'ident_c'
class location(models.Model):
    id_ident_loc = models.AutoField(primary_key=True)
    ident = models.ForeignKey('IdentC', models.DO_NOTHING, db_column='ident')
    loc_name = models.TextField()
    class Meta:
        managed = False
        db_table = 'location'
I want to get the last address field (there could be zero) from the IdentificationAddress model, the last loc_name field (there is at least one match) from the location model, the name field (only one) from the IdentC model, and the ident field. The search is based on the ident field.
I have been reading about many-to-many relationships and prefetch_related, but they don't seem to be the best way to get this information.
If I use SQL syntax, this statement does the job:
SELECT ident_c.name, ident_c.ident, identification_address.address, location.loc_name FROM ident_c FULL OUTER JOIN location ON ident_c.ident=location.ident FULL OUTER JOIN identification_address ON ident_c.ident=identification_address.ident;
or, for this case,
SELECT ident_c.name, ident_c.ident, identification_address.address, location.loc_name FROM ident_c LEFT JOIN location ON ident_c.ident=location.ident LEFT JOIN identification_address ON ident_c.ident=identification_address.ident;
Based on my limited understanding of Django, JOIN statements cannot be implemented. I hope I am wrong.
The Django ORM takes care of it if you define relationships between the models.
For example,
models.py
class Aexample(models.Model):
    name = models.CharField(max_length=20)
class Bexample(models.Model):
    name = models.CharField(max_length=20)
    # on_delete is required since Django 2.0
    fkexample = models.ForeignKey(Aexample, on_delete=models.CASCADE)
shell
examplequery = Bexample.objects.filter(fkexample__name="hellothere")
SQL query
SELECT
"yourtable_bexample"."id",
"yourtable_bexample"."name",
"yourtable_bexample"."fkexample_id"
FROM "yourtable_bexample"
INNER JOIN "yourtable_aexample"
ON ("yourtable_bexample"."fkexample_id" = "yourtable_aexample"."id")
WHERE "yourtable_aexample"."name" = hellothere
You want to make a query in Django like the one below:
SELECT ident_c.name, ident_c.ident, identification_address.address, location.loc_name
FROM ident_c
LEFT JOIN location ON ident_c.ident=location.ident
LEFT JOIN identification_address ON ident_c.ident=identification_address.ident;
This means you want all rows from ident_c, right? If you define the proper relationships between your tables, the Django ORM takes care of it.
class IdentC(models.Model):
    exampleA = models.ForeignKey(ExampleA, on_delete=models.CASCADE)
    exampleB = models.ForeignKey(ExampleB, on_delete=models.CASCADE)
Accessing a related object makes Django fetch it from the related table (as a JOIN in the same query if you add select_related()).
identn_instance = IdentC.objects.get(id=somenumber)
identn_instance.exampleA
identn_instance.exampleB
You can show every IdentC row together with its related rows in the other tables.
for row in IdentC.objects.all():  # iterate over all rows in IdentC
    print(row.exampleA.name)  # name column of the exampleA table
    # matched on identc.exampleA_id = examplea.id
    print(row.exampleB.name)  # name column of the exampleB table
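To make the LEFT JOIN behaviour from the question concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than Django, so it runs on its own. The simplified table layouts and the sample rows are assumptions for illustration; the point is only that an ident with no address still appears in the result, with NULL (None in Python) in the address column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ident_c (ident TEXT, name TEXT);
    CREATE TABLE location (ident TEXT, loc_name TEXT);
    CREATE TABLE identification_address (ident TEXT, address TEXT);
    INSERT INTO ident_c VALUES ('i1', 'Alpha'), ('i2', 'Beta');
    INSERT INTO location VALUES ('i1', 'Lab'), ('i2', 'Office');
    -- Only i1 has an address; i2 has none.
    INSERT INTO identification_address VALUES ('i1', '10.0.0.1');
""")

joined = conn.execute("""
    SELECT c.name, c.ident, ia.address, l.loc_name
    FROM ident_c AS c
    LEFT JOIN location AS l ON c.ident = l.ident
    LEFT JOIN identification_address AS ia ON c.ident = ia.ident
    ORDER BY c.ident
""").fetchall()

for row in joined:
    print(row)
```

An INNER JOIN on identification_address would drop the 'i2' row entirely, which is why the question asks for an outer join.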

Django prefetch_related - filter with or-clause from different tables

I have a model with simple relation
class Tasks(models.Model):
    initiator = models.ForeignKey(User, on_delete=models.CASCADE)
class TaskResponsiblePeople(models.Model):
    task = models.ForeignKey('Tasks')
    auth_user = models.ForeignKey(User)
And I need to write the equivalent of the following SQL query:
select a.initiator, b.auth_user
from Tasks a
inner join TaskResponsiblePeople b
on b.task_id = a.id
where a.initiator = 'value A' OR b.auth_user = 'value B'
The problem is that the OR condition spans two different tables, and I have no idea what the right Django syntax is to mimic the raw SQL query above. Please help me out!
UPDATE 1
According to the below-stated answer, I use the following code:
from django.db.models import Prefetch, Q
people = TaskResponsiblePeople.objects.filter(
    Q(task__initiator=request.user.id) | Q(auth_user=request.user.id)
).select_related('auth_user')
print(people.query)
# The result of the print copy-pasted from console
# SELECT * FROM `task_responsible_people`
# LEFT OUTER JOIN `tasks` ON (`task_responsible_people`.`task_id` = `tasks`.`id`)
# LEFT OUTER JOIN `auth_user` T4
# ON (`task_responsible_people`.`auth_user_id` = T4.`id`)
# WHERE (`tasks`.`initiator_id` = 7 OR
# `task_responsible_people`.`auth_user_id` = 7)
tasks = Tasks.objects.prefetch_related(
    Prefetch('task_responsible_people', queryset=people, to_attr='people'))
However, in the final result set I can still see records where neither initiator nor auth_user is equal to request.user (equal to 7 in this case).
I avoid using ".values" because of the potential need to serialize and transform the queryset into json.
I think you can do it this way if you just want those specific columns:
from django.db.models import Q
qs = Tasks.objects.filter(Q(initiator=userA) | Q(taskresponsiblepeople__auth_user=userB))\
.values('initiator', 'taskresponsiblepeople__auth_user')
To examine the generated query you can look at:
print(qs.query)
I don't have the models in my database, but it should generate a query similar to the following:
SELECT "tasks"."initiator_id", "taskresponsiblepeople"."auth_user_id"
FROM "tasks" LEFT OUTER JOIN "taskresponsiblepeople"
ON ( "tasks"."id" = "taskresponsiblepeople"."task_id" )
WHERE ("tasks"."initiator_id" = userA_id
OR "taskresponsiblepeople"."auth_user_id" = userB_id)
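As a rough check of what such a query returns, the sketch below runs the same shape of SQL with Python's built-in sqlite3 module instead of Django. The table layouts and sample rows are assumptions for illustration; user id 7 plays the role of both userA and userB, as in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tasks (id INTEGER PRIMARY KEY, initiator_id INTEGER);
    CREATE TABLE taskresponsiblepeople (task_id INTEGER, auth_user_id INTEGER);
    -- Task 4 is initiated by user 7 but has no responsible people.
    INSERT INTO tasks VALUES (1, 7), (2, 8), (3, 9), (4, 7);
    INSERT INTO taskresponsiblepeople VALUES (1, 5), (2, 7), (3, 6);
""")

# OR condition spanning both tables, evaluated after the outer join.
matches = conn.execute("""
    SELECT t.id, t.initiator_id, p.auth_user_id
    FROM tasks AS t
    LEFT OUTER JOIN taskresponsiblepeople AS p ON t.id = p.task_id
    WHERE t.initiator_id = ? OR p.auth_user_id = ?
    ORDER BY t.id
""", (7, 7)).fetchall()

print(matches)
```

Task 3 is excluded because neither side of the OR matches, while task 4 is kept even though it has no row in the joined table, which is the effect of the LEFT OUTER JOIN.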

How can I order by ForeignKey field itself in django without join?

I have a model MyModel with a ForeignKey field fkfield, and I need to do something like this (simplified):
MyModel.objects.values_list('id', 'fkfield').order_by('fkfield')
For example, I want to group the results further by fkfield, so I need the objects sorted by this field. The only thing I will use later is fkfield_id; I don't need any data from the related model.
But Django performs a JOIN (as described in the docs) and uses the related model's ordering. The same happens if I explicitly try to order by id:
MyModel.objects.values_list('id', 'fkfield').order_by('fkfield__id')
and I get:
SELECT `mymodel`.`id`,
`mymodel`.`fkfield_id`
FROM `mymodel`
LEFT OUTER JOIN `related_table`
ON ( `mymodel`.`fkfield_id` = `related_table`.`id` )
ORDER BY
`related_table`.`id` ASC
What I really expect is:
SELECT `id`,
`fkfield_id`
FROM `mymodel`
ORDER BY
`fkfield_id` ASC
But I can't find a way to do it. .order_by('fkfield_id') raises an exception saying that there is no such field.
I managed to get things working using extra(), but I can't understand why such simple and obvious behaviour can't be achieved without hacks. Or maybe I missed something?
UPDATE: models.py
class Producer(models.Model):
    name = models.CharField(max_length=100)
    class Meta:
        ordering = ('name',)
class Collection(models.Model):
    name = models.CharField(max_length=100)
    producer = models.ForeignKey('Producer')
    class Meta:
        ordering = ('name',)
print(Collection.objects.values_list('producer', 'id').order_by('producer').query)
# SELECT `catalog_collection`.`producer_id`, `catalog_collection`.`id`
# FROM `catalog_collection`
# INNER JOIN `catalog_producer` ON
# (`catalog_collection`.`producer_id` = `catalog_producer`.`id`)
# ORDER BY `catalog_producer`.`name` ASC
Try .order_by('fkfield').
My query is
Post.objects.values('author', 'id').order_by('author')
as sql:
SELECT "blogs_post"."author_id",
"blogs_post"."id"
FROM "blogs_post"
ORDER BY "blogs_post"."author_id" ASC
UPDATE
Kind of messy solution:
MyModel.objects.extra(select={'fkfield_id': 'fkfield_id'})\
.values_list('id', 'fkfield_id')\
.order_by('fkfield_id')
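The question mentions that the rows must be sorted by the foreign key because they are grouped afterwards. A minimal sketch of that follow-up step in plain Python is shown here; the (id, fkfield_id) pairs are made-up sample data standing in for whatever the query returns, and the sort is done in Python only to keep the example self-contained.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical (id, fkfield_id) pairs in arbitrary order.
rows = [(1, 20), (2, 10), (3, 20), (4, 10)]

# groupby() only merges adjacent items, so the input must be sorted
# by the grouping key first (this is what ORDER BY fkfield_id provides).
rows_sorted = sorted(rows, key=itemgetter(1))

grouped = {
    fk: [pk for pk, _ in group]
    for fk, group in groupby(rows_sorted, key=itemgetter(1))
}
print(grouped)
```

If the rows arrived ordered by some other column (for example the related model's name), groupby() could emit the same foreign key more than once, which is why the ordering column matters here.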

Subquery in select Django

Trying to run a complicated query in Django over Postgresql.
These are my models:
class Link(models.Model):
    short_key = models.CharField(primary_key=True, max_length=8, unique=True, blank=True)
    long_url = models.CharField(max_length=150)
class Stats_links_ads(models.Model):
    link_id = models.ForeignKey(Link, related_name='link_viewed', primary_key=True)
    ad_id = models.ForeignKey(Ad, related_name='ad_viewed')
    views = models.PositiveIntegerField()
    clicks = models.PositiveIntegerField()
I want to run using the Django ORM a query which will translate into something like so:
select a.link_id, sum(a.clicks), sum (a.views), (select long_url from links_link b where b.short_key = a.link_id_id)
from links_stats_links_ads a
group by a.link_id_id;
If I exclude the long_url field that I need, I can run this code and it works:
Stats_links_ads.objects.all().values('link_id').annotate(Sum('views'), Sum('clicks'))
I don't know how to add the subquery to the SELECT clause.
Thanks
You can see the raw SQL behind your queries using the query attribute of a QuerySet.
For example, look at the SQL behind my first answer, which used select_related: the generated SQL clearly doesn't behave as expected, and accessing long_url will result in additional queries.
Take 2
You can follow relationships using double-underscore notation, like this:
from django.db.models import Sum
qs = (Stats_links_ads.objects
      .values('link_id', 'link_id__long_url')
      .annotate(Sum('views'), Sum('clicks')))
str(qs.query)
'SELECT
"stackoverflow_stats_links_ads"."link_id_id",
"stackoverflow_link"."long_url",
SUM("stackoverflow_stats_links_ads"."clicks") AS "clicks__sum",
SUM("stackoverflow_stats_links_ads"."views") AS "views__sum"
FROM "stackoverflow_stats_links_ads"
INNER JOIN "stackoverflow_link"
ON ("stackoverflow_stats_links_ads"."link_id_id" = "stackoverflow_link"."short_key")
GROUP BY
"stackoverflow_stats_links_ads"."link_id_id",
"stackoverflow_link"."long_url"'
I'm not working with any data, so I haven't verified it, but the sql looks right.
Take 1
Does not work
Can't you use .select_related?
qs = (Stats_links_ads.objects.select_related('link')
      .values('link_id').annotate(Sum('views'), Sum('clicks')))
str(qs.query)
'SELECT
"stackoverflow_stats_links_ads"."link_id_id",
SUM("stackoverflow_stats_links_ads"."clicks") AS "clicks__sum",
SUM("stackoverflow_stats_links_ads"."views") AS "views__sum"
FROM "stackoverflow_stats_links_ads"
GROUP BY "stackoverflow_stats_links_ads"."link_id_id"'
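To compare the correlated subquery from the question with the JOIN that "Take 2" generates, here is a minimal sketch that runs both forms with Python's built-in sqlite3 module instead of Django. The table names follow the SQL in the question; the sample rows and URLs are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE links_link (short_key TEXT PRIMARY KEY, long_url TEXT);
    CREATE TABLE links_stats_links_ads (
        link_id_id TEXT, ad_id_id INTEGER, views INTEGER, clicks INTEGER);
    INSERT INTO links_link VALUES ('abc', 'http://example.com/a'),
                                  ('xyz', 'http://example.com/x');
    INSERT INTO links_stats_links_ads VALUES ('abc', 1, 10, 2),
                                             ('abc', 2, 5, 1),
                                             ('xyz', 1, 7, 0);
""")

# Form 1: correlated subquery in the SELECT list (the question's SQL).
with_subquery = conn.execute("""
    SELECT a.link_id_id, SUM(a.clicks), SUM(a.views),
           (SELECT long_url FROM links_link b
            WHERE b.short_key = a.link_id_id)
    FROM links_stats_links_ads a
    GROUP BY a.link_id_id
    ORDER BY a.link_id_id
""").fetchall()

# Form 2: INNER JOIN plus GROUP BY on both columns (the "Take 2" SQL).
with_join = conn.execute("""
    SELECT a.link_id_id, SUM(a.clicks), SUM(a.views), b.long_url
    FROM links_stats_links_ads a
    INNER JOIN links_link b ON a.link_id_id = b.short_key
    GROUP BY a.link_id_id, b.long_url
    ORDER BY a.link_id_id
""").fetchall()

print(with_subquery == with_join)
```

Both forms return the same rows here because every stats row has exactly one matching link; if some stats rows had no matching link, the INNER JOIN form would drop them while the subquery form would return NULL for long_url.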

Django JOIN query without foreign key

Is there a way in Django to write a query using the ORM, not raw SQL, that lets you JOIN on another table without there being a foreign key? From the documentation, it appears that a foreign key must be present for a one-to-one relationship to work.
In the models below, I want to run a query with a JOIN from UserActivity.request_url to UserActivityLink.url.
class UserActivity(models.Model):
    id = models.IntegerField(primary_key=True)
    last_activity_ip = models.CharField(max_length=45, blank=True)
    last_activity_browser = models.CharField(max_length=255, blank=True)
    last_activity_date = models.DateTimeField(auto_now_add=True)
    request_url = models.CharField(max_length=255, blank=True)
    session_id = models.CharField(max_length=255)
    users_id = models.IntegerField()
    class Meta:
        db_table = 'user_activity'
class UserActivityLink(models.Model):
    id = models.IntegerField(primary_key=True)
    url = models.CharField(max_length=255, blank=True)
    url_description = models.CharField(max_length=255, blank=True)
    type = models.CharField(max_length=45, blank=True)
    class Meta:
        db_table = 'user_activity_link'
The link table has a more descriptive translation of given URLs in the system, this is needed for some reporting the system will generate.
I've tried creating the foreign key from UserActivity.request_url to UserActivityLink.url but it fails with the following error: ERROR 1452: Cannot add or update a child row: a foreign key constraint fails
No, unfortunately there isn't an effective way.
.raw() exists for exactly this purpose. Even if the ORM could do it, it would probably be a lot slower than raw SQL.
There is a blog post here detailing how to do it with query.join() but, as the authors themselves point out, it's not best practice.
Just reposting some related answer, so everyone could see it.
Taken from here: Most efficient way to use the django ORM when comparing elements from two lists
First problem: joining unrelated models
I'm assuming that your Model1 and Model2 are not related,
otherwise you'd be able to use Django's related objects
interface. Here are two approaches you could take:
1. Use extra and a SQL subquery:
Model1.objects.extra(where=['field in (SELECT field from myapp_model2 WHERE ...)'])
Subqueries are not handled very efficiently in some databases
(notably MySQL) so this is probably not as good as #2 below.
2. Use a raw SQL query:
Model1.objects.raw('''SELECT * from myapp_model1
                      INNER JOIN myapp_model2
                      ON myapp_model1.field = myapp_model2.field
                      AND ...''')
Second problem: enumerating the result
Two approaches:
1. You can enumerate a query set in Python using the built-in enumerate function:
enumerate(Model1.objects.all())
2. You can use the technique described in this answer to do the enumeration in MySQL. Something like this:
Model1.objects.raw('''SELECT *, @row := @row + 1 AS row
                      FROM myapp_model1
                      JOIN (SELECT @row := 0) rowtable
                      INNER JOIN myapp_model2
                      ON myapp_model1.field = myapp_model2.field
                      AND ...''')
A Django ForeignKey is different from a SQL FOREIGN KEY. A Django ForeignKey just represents a relation, and you can specify whether it should use a database constraint.
Try this:
request_url = models.ForeignKey(UserActivityLink, to_field='url_description', null=True, on_delete=models.SET_NULL, db_constraint=False)
Note that db_constraint=False is required; without it, Django will generate SQL like:
ALTER TABLE `user_activity` ADD CONSTRAINT `xxx` FOREIGN KEY (`request_url`) REFERENCES `user_activity_link` (`url_description`);
I ran into the same problem and, after a lot of research, found the method above.
Hope it helps.
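To illustrate that a join itself does not depend on a database-level constraint, here is a minimal sketch using Python's built-in sqlite3 module rather than Django. The two tables are reduced to the relevant columns, and the sample rows (including the unmatched '/missing' URL) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- No FOREIGN KEY is declared between the two tables.
    CREATE TABLE user_activity (id INTEGER PRIMARY KEY, request_url TEXT);
    CREATE TABLE user_activity_link (url TEXT, url_description TEXT);
    INSERT INTO user_activity VALUES (1, '/home'), (2, '/missing');
    INSERT INTO user_activity_link VALUES ('/home', 'Home page');
""")

# Join purely on matching column values.
activity = conn.execute("""
    SELECT ua.id, ua.request_url, l.url_description
    FROM user_activity AS ua
    LEFT JOIN user_activity_link AS l ON ua.request_url = l.url
    ORDER BY ua.id
""").fetchall()

print(activity)
```

A row such as '/missing', which has no counterpart in the link table, is the kind of data that would make adding a real foreign key constraint fail, while the unconstrained join simply returns None for its description.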