How can I join three Django Models to return a queryset? - django

I want to create a queryset that references three related models, and allows me to filter. The SQL might look like this:
SELECT th.id, th.customer, ft.filename, fva.path
FROM TransactionHistory th
LEFT JOIN FileTrack ft
ON th.InboundFileTrackID = ft.id
LEFT JOIN FileViewArchive fva
ON fva.FileTrackId = ft.id
WHERE th.customer = 'ACME, Inc.'
-- AND ft.filename like '%storage%' --currently don't need to do this, but seeing placeholder logic would be nice
I have three models in Django, shown below. It's a bit tricky, because the TransactionHistory model has two foreign keys to the same model (FileTrack). And FileViewArchive has a foreign key to FileTrack.
class FileTrack(models.Model):
    id = models.BigIntegerField(db_column="id", primary_key=True)
    filename = models.CharField(db_column="filename", max_length=128)

    class Meta:
        managed = False
        db_table = "FileTrack"

class TransactionHistory(models.Model):
    id = models.BigIntegerField(db_column="id", primary_key=True)
    customer = models.CharField(db_column="Customer", max_length=128)
    inbound_file_track = models.ForeignKey(
        FileTrack,
        db_column="InboundFileTrackId",
        related_name="inbound_file_track_id",
        on_delete=models.DO_NOTHING,
        null=True,
    )
    outbound_file_track = models.ForeignKey(
        FileTrack,
        db_column="OutboundFileTrackId",
        related_name="outbound_file_track_id",
        on_delete=models.DO_NOTHING,
        null=True,
    )

    class Meta:
        managed = False
        db_table = "TransactionHistory"

class FileViewArchive(models.Model):
    id = models.BigIntegerField(db_column="id", primary_key=True)
    file_track = models.ForeignKey(
        FileTrack,
        db_column="FileTrackId",
        related_name="file_track_id",
        on_delete=models.DO_NOTHING,
        null=True,
    )
    path = models.CharField(db_column="Path", max_length=256)

    class Meta:
        managed = False
        db_table = "FileViewArchive"
One thing I tried:
qs1 = TransactionHistory.objects.select_related('inbound_file_track').filter(customer='ACME, Inc.')
qs2 = FileViewArchive.objects.select_related('file_track').all()
qs = qs1 & qs2 # doesn't work b/c they are different base models
And this idea to use chain doesn't work either, because it sends two separate queries and I'm not altogether sure if/how it's merging them. I'm looking for a single query in order to be more performant. Also, it returns an iterable, so I'm not sure I can use it in my view (Django Rest Framework). Lastly, x below is a TransactionHistory object, so I can't even access the fields from the other two models.
from itertools import chain
c = chain(qs1 | qs2) # great that this is lazy and doesn't evaluate until used!
type(c) # this returns <class 'itertools.chain'> and it doesn't consolidate
x = list(c)[0] # runs two separate queries
type(x) # a TransactionHistory object -> so no access to the FileTrack or FileViewArchive fields
Any ideas how I can join three models together? Something like this?:
qs = TransactionHistory.objects.select_related('inbound_file_track').select_related('file_track').filter(customer='ACME, Inc.', file_track__filename__contains='storage')
More info: this is part of a view that will look like the one below. It returns a queryset that is used as part of a Django Rest Framework view.
class Transaction(generics.ListAPIView):
    serializer_class = TransactionSerializer

    def filter_queryset(self, queryset):
        query_params = self.request.query_params.copy()
        company = query_params.pop("company", [])[0]
        filename = query_params.pop("filename", [])[0]
        # need code here that generate filtered queryset for filename and company
        # qs = TransactionHistory.objects.select_related('inbound_file_track').select_related('file_track').filter(customer='ACME, Inc.', file_track__filename__contains='storage')
        return qs.order_by("id")

Based on the SQL query you shared, you are filtering on the file name of inbound_file_track, so something like this should work:
TransactionHistory.objects.select_related(
    'inbound_file_track',
).prefetch_related(
    'inbound_file_track__file_track_id',
).filter(
    customer='ACME, Inc.',
    inbound_file_track__filename__contains='storage',
)
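To check what result set the ORM call is supposed to reproduce, the SQL from the question can be run against a throwaway database. The sketch below uses the standard-library sqlite3 module instead of Django, and the sample rows are made up for illustration. It shows two things worth keeping in mind: a LEFT JOIN onto FileViewArchive repeats the transaction once per archive row, and adding the filename condition drops transactions that have no inbound file.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE FileTrack (id INTEGER PRIMARY KEY, filename TEXT);
CREATE TABLE TransactionHistory (
    id INTEGER PRIMARY KEY,
    Customer TEXT,
    InboundFileTrackId INTEGER REFERENCES FileTrack(id),
    OutboundFileTrackId INTEGER REFERENCES FileTrack(id)
);
CREATE TABLE FileViewArchive (
    id INTEGER PRIMARY KEY,
    FileTrackId INTEGER REFERENCES FileTrack(id),
    Path TEXT
);
-- Made-up sample rows, for illustration only.
INSERT INTO FileTrack VALUES (1, 'storage_in.csv'), (2, 'other.csv');
INSERT INTO TransactionHistory VALUES
    (10, 'ACME, Inc.', 1, NULL),
    (11, 'ACME, Inc.', NULL, NULL),
    (12, 'Other Co.', 2, NULL);
INSERT INTO FileViewArchive VALUES (100, 1, '/archive/a'), (101, 1, '/archive/b');
""")

# The query from the question, with the customer passed as a parameter.
base_sql = """
SELECT th.id, th.Customer, ft.filename, fva.Path
FROM TransactionHistory th
LEFT JOIN FileTrack ft ON th.InboundFileTrackId = ft.id
LEFT JOIN FileViewArchive fva ON fva.FileTrackId = ft.id
WHERE th.Customer = ?
"""

# Customer filter only: transaction 11 survives with NULLs from the LEFT JOINs.
all_rows = conn.execute(
    base_sql + " ORDER BY th.id, fva.id", ("ACME, Inc.",)
).fetchall()

# Adding the filename condition removes rows whose filename is NULL.
storage_rows = conn.execute(
    base_sql + " AND ft.filename LIKE ? ORDER BY th.id, fva.id",
    ("ACME, Inc.", "%storage%"),
).fetchall()
```

Note that prefetch_related does not add the second join. It fetches the FileViewArchive rows in a separate query and attaches them to each FileTrack, so the queryset yields one TransactionHistory per transaction rather than one row per archive entry.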

Related

Annotating in django using a field from another model

I'm trying to calculate a field that is a sum aggregation over rows that share the same values in two other fields. The problem is that one of these fields is from the same model as the summed field, but the other one is from another model.
models.py:
class Node(models.Model):
    text = models.CharField(max_length=100, verbose_name=_('Text'))
    perimeter_ID = models.CharField(max_length=100, verbose_name=_('Perimeter_ID'))
    pos_x = models.FloatField(verbose_name=_('Position X'))
    pos_y = models.FloatField(verbose_name=_('Position Y'))

    class Meta:
        ordering = ['pk']

class Routes(models.Model):
    from_id = models.ForeignKey(Node, on_delete=models.PROTECT, verbose_name=_('From'), related_name='transport_from')
    to_id = models.ForeignKey(Node, on_delete=models.PROTECT, verbose_name=_('To'), related_name='transport_to')
    n_box = models.FloatField(verbose_name=_('Number of boxes'))
    day = models.CharField(max_length=10, verbose_name=_('day'))

    class Meta:
        ordering = ['from_id__id', 'to_id__id']
In my views, I first filter by day, as the request will be something like filter/?day=20220103. Then I want the sum of boxes for rows where from_id and perimeter_id are the same (the perimeter_id of the node corresponding to to_id). So I somehow need to relate the to_id node to its perimeter_id.
views.py:
class sumByPerimeterListAPIView(ListAPIView):
    queryset = Routes.objects.all()
    filter_backends = [DjangoFilterBackend]
    filter_fields = {
        'day': ["in", "exact"],
    }

    def get(self, request, **kwargs):
        queryset = self.get_queryset()
        filter_queryset = self.filter_queryset(queryset)
        # TODO: INCLUDE perimeter_id OF to_id
        values = (
            filter_queryset.values('from_id')
            .annotate(n_box=Sum('n_box'))
        )
        return Response(values)
I've been reading about Subquery and OuterRef in Django, including the following link: Django 1.11 Annotating a Subquery Aggregate. But these examples don't fit my case, as I don't need to annotate the field perimeter_id. I need to group by from_id (model Routes) and perimeter_id (model Node) and annotate with the sum of n_box.
To get a conditional annotation, try using Case and When inside the annotation.
Below is an example:
from django.db.models import Count, Case, When, IntegerField
Article.objects.annotate(
    numviews=Count(Case(
        When(readership__what_time__lt=threshold, then=1),
        output_field=IntegerField(),
    ))
)
In your case, you would change the line roughly as follows. Note that When goes inside Case, and that field references need F expressions; adjust the When condition to the comparison you actually need:
from django.db.models import Case, F, FloatField, Sum, When
values = filter_queryset.values('from_id').annotate(
    n_box=Sum(Case(
        When(from_id=F('from_id__perimeter_ID'), then=F('n_box')),
        output_field=FloatField(),
    ))
)
Try something similar using Case, When, and then.
I hope it will solve your conditioning issues.
Also, have a look at F expressions. They are very useful when it comes to conditions.
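If the goal is simply to group by from_id together with the perimeter of the to_id node, it may help to look at the SQL that grouping corresponds to. The sketch below uses the standard-library sqlite3 module rather than Django, with simplified table names and made-up rows that are assumptions for illustration. In the ORM, listing both keys in values() before annotate(), for example values('from_id', 'to_id__perimeter_ID'), is the usual way to get a GROUP BY over both columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE node (id INTEGER PRIMARY KEY, perimeter_ID TEXT);
CREATE TABLE routes (
    id INTEGER PRIMARY KEY,
    from_id INTEGER REFERENCES node(id),
    to_id INTEGER REFERENCES node(id),
    n_box REAL,
    day TEXT
);
-- Made-up sample data: nodes 1 and 2 share perimeter P1.
INSERT INTO node VALUES (1, 'P1'), (2, 'P1'), (3, 'P2');
INSERT INTO routes VALUES
    (1, 1, 2, 5.0, '20220103'),
    (2, 1, 1, 2.0, '20220103'),
    (3, 1, 3, 4.0, '20220103'),
    (4, 2, 3, 1.5, '20220103'),
    (5, 1, 2, 9.0, '20220104');
""")

# Join each route to its destination node, then sum boxes per
# (origin node, destination perimeter) pair for the requested day.
totals = conn.execute(
    """
    SELECT r.from_id, n.perimeter_ID, SUM(r.n_box) AS n_box
    FROM routes r
    JOIN node n ON n.id = r.to_id
    WHERE r.day = ?
    GROUP BY r.from_id, n.perimeter_ID
    ORDER BY r.from_id, n.perimeter_ID
    """,
    ("20220103",),
).fetchall()
```

Routes 1 and 2 end in different nodes but the same perimeter, so they collapse into a single total for (1, 'P1').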

multiple joins on django queryset

For the below sample schema
# schema sample
class A(models.Model):
    n = models.ForeignKey(N, on_delete=models.CASCADE)
    d = models.ForeignKey(D, on_delete=models.PROTECT)

class N(models.Model):
    id = models.AutoField(primary_key=True, editable=False)
    d = models.ForeignKey(D, on_delete=models.PROTECT)

class D(models.Model):
    dsid = models.CharField(max_length=255, primary_key=True)

class P(models.Model):
    id = models.AutoField(primary_key=True, editable=False)
    name = models.CharField(max_length=255)
    n = models.ForeignKey(N, on_delete=models.CASCADE)

# raw query for the result I want
# SELECT P.name
# FROM P, N, A
# WHERE (P.n_id = N.id
# AND A.n_id = N.id
# AND A.d_id = 'MY_DSID'
# AND P.name = 'MY_NAME')
What am I trying to achieve?
Well, I’m trying to find a way to write a single queryset that does the same as the raw query above. So far I have been able to do it with two querysets, using the result of the first one to build the second and get the final DB records. However, that’s 2 hits to the DB, and I want to optimize it by doing everything in one DB hit.
What would the queryset be for this kind of raw query? Or is there a better way to do it?
Above code is here https://dpaste.org/DZg2
You can achieve it using the related_name attribute and functions like select_related and prefetch_related.
I'm assuming the related name for each model is the model's name plus _items, but it is better to have proper model names and then provide meaningful related names. The related name is how you access the model in the backward direction.
This way, you can use the query below to load all the models. select_related follows the foreign keys in one joined query, and prefetch_related adds one extra query for the reverse relation:
A.objects.all().select_related("n", "d", "n__d").prefetch_related("n__p_items")
I edited the code in the pasted site, however, it will expire soon.

Django all related data

class Docs(models.Model):
    doc_id = models.BigIntegerField(primary_key=True)
    journal = models.CharField(max_length=50, blank=True, null=True)
    year = models.IntegerField(blank=True, null=True)

    class Meta:
        managed = False
        db_table = 'docs'

class Assays(models.Model):
    assay_id = models.BigIntegerField(primary_key=True)
    doc = models.ForeignKey('Docs', models.DO_NOTHING)
    description = models.CharField(max_length=4000, blank=True, null=True)

    class Meta:
        managed = False
        db_table = 'assays'

class Activities(models.Model):
    activity_id = models.BigIntegerField(primary_key=True)
    assay = models.ForeignKey(Assays, models.DO_NOTHING)
    doc = models.ForeignKey(Docs, models.DO_NOTHING, blank=True, null=True)
    record = models.ForeignKey('CompoundRecords', models.DO_NOTHING)

    class Meta:
        managed = False
        db_table = 'activities'
I apologize in advance if this answer is easily found elsewhere. I have searched all over and do not see a simple way to query my data as intuitively as I feel should be possible.
These are classes for 3 tables. The actual dataset is closer to 100 tables. Each doc_id can have one or many associated activity_ids. Each activity_id is associated with one assay_id.
My goal is to obtain all of the related data for each of the activities in a single doc. For instance:
query_activities_values = Docs.objects.get(doc_id=5535).activities_set.values()
for y in query_activities_values:
    print(y)
    break
>>> {'activity_id': 753688, 'assay_id': 158542, 'doc_id': 5535, .....
This returns 32 dictionaries (only part of the first is shown) for columns in the Activities table that have doc_id=5535. I would like to go one step further and also automatically pull in all of the data from the Assays table that is associated with the corresponding assay_id for each dictionary.
I can access that Assay data through a similar query, but only by stating each field explicitly:
query_activities_values = Docs.objects.get(doc_id=5535).activities_set.values('assay', 'assay__assay_type', 'assay__description')
for y in query_activities_values:
    print(y)
    break
I would like a single query that finds not only the assay and associated assay data for one activity_id, but also all data and associated data for the 90+ other tables associated in the model.
Thank you
Update 1
I did find this code that works surprisingly well for my needs, however, I was curious if this is the best method:
from django.forms.models import model_to_dict
def serial_model(modelobj):
    opts = modelobj._meta.fields
    modeldict = model_to_dict(modelobj)
    for m in opts:
        if m.is_relation:
            foreignkey = getattr(modelobj, m.name)
            if foreignkey:
                try:
                    modeldict[m.name] = serial_model(foreignkey)
                except:
                    pass
    return modeldict
That's not too much code, but I thought there may be a more built-in way to do this.
What you need is prefetch_related:
Django 2.2 Prefetch Related Docs
query_activities_values = Docs.objects.get(doc_id=5535).activities_set.values()
Would become:
query_activities_values = Docs.objects.prefetch_related(models.Prefetch("activities_set", to_attr="activities"), models.Prefetch("assays_set", to_attr="assays")).get(doc_id=5535)
New attributes called "activities" and "assays" will be created on the returned object, which you can use to retrieve the data.
One more thing. This isn't actually 1 query. It's 3. However, if you're getting more than just one object from Docs, it's still going to be 3.
Also, is there a reason why you're using BigIntegerField?
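The fixed count of three queries comes from how prefetching batches its lookups: one query for the parent rows, then one query per related table using an IN clause, with the results stitched together in Python. The sketch below imitates that pattern with the standard-library sqlite3 module. It is not Django's implementation, and the helper name fetch_children and the sample rows are assumptions made up for illustration.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE docs (doc_id INTEGER PRIMARY KEY, journal TEXT);
CREATE TABLE assays (assay_id INTEGER PRIMARY KEY, doc_id INTEGER, description TEXT);
CREATE TABLE activities (activity_id INTEGER PRIMARY KEY, assay_id INTEGER, doc_id INTEGER);
-- Made-up sample rows.
INSERT INTO docs VALUES (1, 'J1'), (2, 'J2'), (3, 'J3');
INSERT INTO assays VALUES (10, 1, 'a'), (11, 2, 'b'), (12, 2, 'c');
INSERT INTO activities VALUES (100, 10, 1), (101, 11, 2), (102, 12, 2);
""")

# Record every SELECT the connection runs from here on.
selects = []

def record_select(statement):
    if statement.lstrip().upper().startswith("SELECT"):
        selects.append(statement)

conn.set_trace_callback(record_select)

def fetch_children(table, key_column, doc_ids):
    """Hypothetical helper: one IN query per child table, grouped by doc_id."""
    placeholders = ", ".join("?" for _ in doc_ids)
    sql = (
        f"SELECT doc_id, {key_column} FROM {table} "
        f"WHERE doc_id IN ({placeholders}) ORDER BY {key_column}"
    )
    grouped = defaultdict(list)
    for doc_id, child_id in conn.execute(sql, doc_ids):
        grouped[doc_id].append(child_id)
    return grouped

# Query 1: the parent rows.
doc_ids = [row[0] for row in conn.execute("SELECT doc_id FROM docs ORDER BY doc_id")]
# Queries 2 and 3: one batched lookup per related table.
activities = fetch_children("activities", "activity_id", doc_ids)
assays = fetch_children("assays", "assay_id", doc_ids)

result = {
    doc_id: {"activities": activities[doc_id], "assays": assays[doc_id]}
    for doc_id in doc_ids
}
query_count = len(selects)
```

The query count stays at three whether there is one document or thousands, which is why the prefetch approach scales better than looking up children document by document.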

Query a many to many field of referenced table using ORM

My models are as follows:
class AppUser(models.Model):
    id = models.AutoField(primary_key=True)
    user = models.OneToOneField(User)
    states = models.ManyToManyField(State)

class ABC(models.Model):
    id = models.AutoField(primary_key=True)
    name = models.CharField(max_length=50)
    email = models.EmailField()
    app_user = models.ForeignKey(AppUser, null=True, blank=True)
I want to query my database for list of objects present in ABC model and I want to filter it according to the list of States.
I am trying something like this:
ABC.objects.filter(app_user__states__in = state_list).values('id','name')
But this is not working. Can I even access a many-to-many field like this, or do I need to create a custom through table?
Yes, you can.
For queryset:
ABC.objects.filter(app_user__states__in = [1,2]).values('id', 'name')
you'll get sql like this:
>>> print(ABC.objects.filter(app_user__states__in = [1,2]).values('id', 'name').query)
SELECT "test_abc"."id", "test_abc"."name"
FROM "test_abc"
INNER JOIN "test_appuser" ON ("test_abc"."app_user_id" = "test_appuser"."id")
INNER JOIN "test_appuser_states" ON ("test_appuser"."id" = "test_appuser_states"."appuser_id")
WHERE "test_appuser_states"."state_id" IN (1, 2);
Looks fine. Maybe it doesn't work as you expected?
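One common way such a query "doesn't work as expected" is duplicate rows: an ABC row is returned once for every matching state of its AppUser, because the join goes through the many-to-many table. The sketch below runs the generated SQL shown above against an in-memory SQLite database (standard-library sqlite3, made-up rows) to show the effect. In the ORM, appending .distinct() to the queryset is the usual remedy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test_appuser (id INTEGER PRIMARY KEY);
CREATE TABLE test_appuser_states (
    id INTEGER PRIMARY KEY,
    appuser_id INTEGER,
    state_id INTEGER
);
CREATE TABLE test_abc (id INTEGER PRIMARY KEY, name TEXT, app_user_id INTEGER);
-- Made-up sample rows: app user 1 belongs to states 1 and 2.
INSERT INTO test_appuser VALUES (1), (2);
INSERT INTO test_appuser_states (appuser_id, state_id) VALUES (1, 1), (1, 2), (2, 3);
INSERT INTO test_abc VALUES (10, 'first', 1), (11, 'second', 2), (12, 'orphan', NULL);
""")

# Same joins as the SQL printed above; {distinct} toggles SELECT DISTINCT.
join_sql = """
SELECT {distinct} "test_abc"."id", "test_abc"."name"
FROM "test_abc"
INNER JOIN "test_appuser" ON ("test_abc"."app_user_id" = "test_appuser"."id")
INNER JOIN "test_appuser_states" ON ("test_appuser"."id" = "test_appuser_states"."appuser_id")
WHERE "test_appuser_states"."state_id" IN (1, 2)
ORDER BY "test_abc"."id"
"""

# Without DISTINCT, row 10 appears once per matching state.
plain_rows = conn.execute(join_sql.format(distinct="")).fetchall()
# With DISTINCT, each ABC row appears once.
distinct_rows = conn.execute(join_sql.format(distinct="DISTINCT")).fetchall()
```

Rows whose app_user is NULL, such as the 'orphan' row here, never appear, because the joins are INNER JOINs.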

Django admin list_display reverse to parent

I have 2 models:
1: KW (individual keywords)
2: Project (many keywords can belong to many different projects)
class KW(models.Model):
    ...
    project = models.ManyToManyField('KWproject', blank=True)

class KWproject(models.Model):
    ProjectKW = models.CharField('Main Keyword', max_length=1000)
    author = models.ForeignKey(User, editable=False)
Now, when a user is in the Admin for KWproject, they should be able to see all keywords belonging to the selected project in list_display. I achieved this, but it doesn't feel like the proper way.
class ProjectAdmin(admin.ModelAdmin):
    form = ProjectForm
    list_display = ('Keywordd', 'author')

    def Keywordd(self, obj):
        return '%s' % (obj.id, obj.ProjectKW)
    Keywordd.allow_tags = True
    Keywordd.admin_order_field = 'ProjectKW'
    Keywordd.short_description = 'ProjectKW'
Is there a better way to link and then list_display all items that have a reverse relationship to the model? (via the "project" field in my example)
As per the Django Admin docs:
ManyToManyField fields aren’t supported, because that would entail
executing a separate SQL statement for each row in the table. If you
want to do this nonetheless, give your model a custom method, and add
that method’s name to list_display. (See below for more on custom
methods in list_display.)
So, you may opt to implement a custom model method like so:
# models.py
class KW(models.Model):
    ...
    project = models.ManyToManyField('KWproject', blank=True)

class KWproject(models.Model):
    ProjectKW = models.CharField('Main Keyword', max_length=1000)
    author = models.ForeignKey(User, editable=False)

    def all_keywords(self):
        # Retrieve your keywords.
        # kw_set is the default reverse accessor (lowercased model name + "_set").
        # You can change it with related_name in your model definitions.
        keywords = self.kw_set.values_list('desired_fieldname', flat=True)
        # Do some transformation here
        desired_output = ','.join(keywords)
        # Return value (in example, csv of keywords)
        return desired_output
And then, add that model method to your list_display tuple in your ModelAdmin.
# admin.py
class ProjectAdmin(admin.ModelAdmin):
    form = ProjectForm
    list_display = ('Keywordd', 'author', 'all_keywords')

    def Keywordd(self, obj):
        return '%s' % (obj.id, obj.ProjectKW)
    Keywordd.allow_tags = True
    Keywordd.admin_order_field = 'ProjectKW'
    Keywordd.short_description = 'ProjectKW'
Do take note: This can potentially be a VERY EXPENSIVE operation. If you are showing 200 rows in the list, then a request to the page will execute 200 additional SQL queries.
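To make the cost concrete, the sketch below counts the SELECT statements issued by a per-row lookup versus a single joined query. It uses the standard-library sqlite3 module rather than the Django admin, and the table names, the keyword column, and the sample rows are all hypothetical stand-ins for the models above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE kwproject (id INTEGER PRIMARY KEY, ProjectKW TEXT);
CREATE TABLE kw (id INTEGER PRIMARY KEY, keyword TEXT);
CREATE TABLE kw_project (kw_id INTEGER, kwproject_id INTEGER);
-- Made-up sample rows: project 3 has no keywords.
INSERT INTO kwproject VALUES (1, 'alpha'), (2, 'beta'), (3, 'gamma');
INSERT INTO kw VALUES (1, 'red'), (2, 'green'), (3, 'blue');
INSERT INTO kw_project VALUES (1, 1), (2, 1), (3, 2);
""")

# Record every SELECT the connection runs from here on.
selects = []

def record_select(statement):
    if statement.lstrip().upper().startswith("SELECT"):
        selects.append(statement)

conn.set_trace_callback(record_select)

# Per-row approach: one query for the list, plus one query per listed project.
per_row = {}
projects = conn.execute("SELECT id FROM kwproject ORDER BY id").fetchall()
for (project_id,) in projects:
    names = conn.execute(
        "SELECT kw.keyword FROM kw "
        "JOIN kw_project kp ON kp.kw_id = kw.id "
        "WHERE kp.kwproject_id = ? ORDER BY kw.id",
        (project_id,),
    ).fetchall()
    per_row[project_id] = ",".join(name for (name,) in names)
per_row_queries = len(selects)

# Joined approach: a single query, grouped in Python afterwards.
selects.clear()
grouped = {}
for project_id, keyword in conn.execute(
    "SELECT p.id, kw.keyword FROM kwproject p "
    "LEFT JOIN kw_project kp ON kp.kwproject_id = p.id "
    "LEFT JOIN kw ON kw.id = kp.kw_id "
    "ORDER BY p.id, kw.id"
):
    grouped.setdefault(project_id, [])
    if keyword is not None:
        grouped[project_id].append(keyword)
joined = {pid: ",".join(words) for pid, words in grouped.items()}
joined_queries = len(selects)
```

With three projects the per-row version already needs four queries against one for the joined version, and the gap grows linearly with the number of rows shown in the list.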