How to convert select_related queries in Django into a single dataframe - django

I am working in Django and I would like to simply merge three tables into one dataset. My models look like so and I have tried variants of the following command:
class Quadrat(models.Model):
    quadrat_number = models.IntegerField()
class Capture(models.Model):
    quadrat = models.ForeignKey(Quadrat, on_delete=models.DO_NOTHING)
    capture_total = models.IntegerField()
class Species(models.Model):
    capture = models.ForeignKey(Capture, on_delete=models.DO_NOTHING)
    species = models.CharField(max_length=50)
data = pd.DataFrame(list(Species.objects.select_related('Capture_set', 'Quadrat_set').all()))
I get an attribute error: Cannot find 'Capture_set' on Species object, 'Capture_set' is an invalid parameter to select_related()
I am probably using select_related incorrectly.
Any ideas?

Related

Get multiple columns data from extra query in Django

I have two models as below:
class Model1(models.Model):
    id = models.UUIDField(default=uuid.uuid4, primary_key=True)
    filename = models.CharField(max_length=255)
class Model2(models.Model):
    id = models.UUIDField(default=uuid.uuid4, primary_key=True)
    filename = models.CharField(max_length=255)
I would like to get the Model2 rows whose filename has the same value as the filename of a given Model1 row.
My solution was to use either Subquery or Extra. The problem with Subquery is that it allows only one column to be queried but what I want is a dict object of all the columns of model2 related to model1. They do not have a foreignkey relation to use select_related. So I tried to use extra but again I am getting the multiple columns error. How can I solve this problem?
My code is as below:
model1_objs = Model1.objects.filter(id=given_id).extra(
    select={
        "model2_obj": f"SELECT *
            FROM model2
            WHERE filename = model1.filename
            AND id = '{model2_id}'"
    }
)
This does not work.
This might not be as efficient as the solution you were thinking of but you could try this:
model1_obj = Model1.objects.get(id=given_id)
model2_objs = Model2.objects.filter(filename=model1_obj.filename)
Note: consider using the get() method to fetch unique objects
Have a look at this: Making queries (Django documentation)

multiple joins on django queryset

For the below sample schema
# sample schema
class A(models.Model):
    # string references, since N and D are defined further down
    n = models.ForeignKey('N', on_delete=models.CASCADE)
    d = models.ForeignKey('D', on_delete=models.PROTECT)
class N(models.Model):
    id = models.AutoField(primary_key=True, editable=False)
    d = models.ForeignKey('D', on_delete=models.PROTECT)
class D(models.Model):
    dsid = models.CharField(max_length=255, primary_key=True)
class P(models.Model):
    id = models.AutoField(primary_key=True, editable=False)
    name = models.CharField(max_length=255)
    n = models.ForeignKey(N, on_delete=models.CASCADE)
# raw query for the result I want
# SELECT P.name
# FROM P, N, A
# WHERE (P.n_id = N.id
# AND A.n_id = N.id
# AND A.d_id = 'MY_DSID'
# AND P.name = 'MY_NAME')
What am I trying to achieve?
I'm trying to write a single queryset that does the same as the raw query above. So far I have managed it with two querysets, using the result of the first one to build the second and fetch the final DB records. However, that is 2 hits to the DB, and I want to optimize it by doing everything in one DB hit.
What would the queryset for this kind of raw query be? Or is there a better way to do it?
Above code is here https://dpaste.org/DZg2
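You can achieve this using the related_name attribute and functions like select_related and prefetch_related.
This assumes the related name for each foreign key is the model's name plus _items (for example p_items on P.n), but it is better to have proper model names and then provide meaningful related names. The related name is how you access a model backward, from the other side of the relation.
With that in place, you can use this query to load the A rows together with their related objects:
A.objects.all().select_related("n", "d", "n__d").prefetch_related("n__p_items")
Note that select_related follows the forward foreign keys in a single SQL join, while prefetch_related issues one additional query for its lookup, so this is two queries rather than one.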
I edited the code in the pasted site, however, it will expire soon.

How can I query embedded records in a Djongo ArrayField?

I'm using Django 3.0.6 with Django Rest Framework 3.11.1. I'm connecting to a MongoDB database using the djongo connector. One of my models has an ArrayField that contains embedded records. I would like to know how I can retrieve these embedded fields using a django query.
Each Person can have many different sub records. Here is a sample model to illustrate what I'm working on
Models:
from djongo import models
class SubRecords(models.Model):
    status = models.CharField(max_length=20)
    startTime = models.CharField(max_length=20)
    identifier = models.CharField(max_length=20)
    job_title = models.CharField(max_length=20)
    class Meta:
        abstract = True
class Person(models.Model):
    _id = models.ObjectIdField()
    workplace = models.CharField(max_length=120)
    subject = models.CharField(max_length=120)
    records = models.ArrayField(model_container=SubRecords)
I would like to query the Person model and get (1) all Person.records objects and (2) only the Person.records objects that match some criteria.
I have tried to do this
>>> Person.objects.filter(records__exact={'job_title': 'HR'})
Now the problem I'm facing is that the result isn't limited to subrecords where the job title is HR, instead if a Person object contains a sub record that matches the criteria, the whole Person object and associated sub records are returned.
I want to be able to get a list of all the subrecords and only the subrecords that match the criteria I specify. How can I do this?
You don't need __exact:
Person.objects.filter(records={'job_title':'HR'})
I suppose you can do it like this
Person.objects.filter(records__in={'job_title':'HR'})
You can add .all() at the end of query if you want.
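Since the filter selects whole Person objects, one fallback is to narrow the embedded records in plain Python after the query has run. The helper below is a hypothetical sketch, not a Djongo API: matching_records is a name made up here, and it assumes each person's records attribute is a list of dicts. If your Djongo version returns embedded model instances instead, getattr(record, key, None) would replace record.get(key). The SimpleNamespace objects only stand in for the Person instances a queryset would return.

```python
from types import SimpleNamespace


def matching_records(people, **criteria):
    """Collect embedded records whose fields equal every given criterion.

    With no criteria, every record of every person is returned.
    """
    matches = []
    for person in people:
        for record in person.records:
            if all(record.get(key) == value for key, value in criteria.items()):
                matches.append(record)
    return matches


# Stand-ins (assumption) for the result of Person.objects.filter(...).
people = [
    SimpleNamespace(
        workplace="Acme",
        records=[
            {"job_title": "HR", "status": "active"},
            {"job_title": "Engineer", "status": "active"},
        ],
    ),
    SimpleNamespace(
        workplace="Globex",
        records=[{"job_title": "HR", "status": "left"}],
    ),
]

hr_records = matching_records(people, job_title="HR")   # only the HR subrecords
all_records = matching_records(people)                  # every subrecord
```

Filtering the Person queryset first, as in the answers above, keeps the Python loop limited to people who have at least one matching record.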

Getting a queryset using a foreign key field in the "other side" of a foreign key relation

Forgive me if the question does not make sense; I am trying to teach myself Django. I've been trying to search for how to do this, but I'm not sure if I'm using the right words in my search.
I have the following models.
class Category(models.Model):
    code = models.CharField(max_length=10, unique=True)
    description = models.CharField(max_length=50)
class UserGroupHeader(models.Model):
    code = models.CharField(max_length=10, unique=True)
    description = models.CharField(max_length=50)
class UserGroupDetail(models.Model):
    usergroupheader = models.ForeignKey(UserGroupHeader, on_delete=models.CASCADE)
    category = models.ForeignKey(Category, on_delete=models.PROTECT)
How do I get a queryset from the Category model using the UserGroupHeader? So far, what I've got is something like UserGroupHeader.objects.get(pk=9).usergroupdetail_set.all(). From the result of this, how do I get the Category model?
I'm not sure if I understood exactly what you are trying to do, but in general, while querying, you can follow relations using double underscores. Below are a couple of possible queries:
my_group_header = UserGroupHeader.objects.get(...)
Category.objects.filter(usergroupdetail__usergroupheader=my_group_header) # Gets Category objects related to my_group_header through UserGroupDetail model
Category.objects.filter(usergroupdetail__usergroupheader__code='abc') # Gets Category objects related to UserGroupHeader object with code 'abc' through UserGroupDetail model
UserGroupHeader.objects.filter(usergroupdetail__category__code='abc') # Gets UserGroupHeader objects related to Category object with code 'abc' through UserGroupDetail model
Your query UserGroupHeader.objects.get(pk=9).usergroupdetail_set.all() would return a QuerySet of UserGroupDetail objects. In order to get the category of each UserGroupDetail, you can:
for user_group_detail in UserGroupHeader.objects.get(pk=9).usergroupdetail_set.all():
    category = user_group_detail.category
    print(category.description)
Or something similar according to your needs

Better way to query this Django Model

I am new to Django. I have a model and now I need to query the model.
The model is this:
from django.db import models
import datetime
class Position(models.Model):
    website_position = models.IntegerField()
    # on_delete is required since Django 2.0; CASCADE was the earlier default
    category = models.ForeignKey('Category', on_delete=models.CASCADE)
    directorio = models.ForeignKey('Directorio', on_delete=models.CASCADE)
class Category(models.Model):
    n_category = models.CharField(max_length=100)
    category_position = models.IntegerField()
    date_inserted = models.DateTimeField(auto_now_add=True)
    date_last_update = models.DateTimeField(auto_now_add=True)
class Directorio(models.Model):
    website_name = models.CharField(max_length=200)
    website_url = models.URLField()
    activated = models.BooleanField()
    date_inserted = models.DateTimeField(auto_now_add=True)
    date_last_update = models.DateTimeField(auto_now=True)
    categories = models.ManyToManyField(Category, through=Position)
Now, I need to query the database in this way:
select dc.n_category, dp.website_position, dd.website_name, dd.website_url, dd.activated
from directorio_category as dc
join directorio_position as dp on dc.id = dp.category_id
join directorio_directorio as dd on dp.directorio_id = dd.id
where dd.activated = 'true'
Question: Should I use the Model Query API or should I use RAW SQL?
Best Regards,
You can do it with the model query API, but if you already have the raw SQL written, that's going to be more efficient in the long term if you have to scale to enormous heights.
Pro of raw SQL: More efficiently hit the database.
Pro of query API: Non-SQL Django people will be able to maintain and extend your code in the future.
I've been interacting with databases via django's orm so long that I'm struggling to figure out what your query even means.
# I think this gets what you want
positions = Position.objects.filter(directorio__activated=True).order_by('category__n_category', 'website_position')
for position in positions:
    print(position.category.n_category, position.website_position, position.directorio.website_name, position.directorio.website_url, position.directorio.activated)
The key to migrating from SQL to django's ORM is starting to think in terms of the primary object(s) you want, then walking the relationships to get the related data. All data related to an object is available to you via object/dot notation, and thanks to the django ORM's laziness, most of it isn't retrieved until you ask for it.
The above code gets every position by category, then walks the relationships to get the other data. Another version of the code might get every active website then show its categories and positions:
websites = Directorio.objects.filter(activated=True).order_by('website_name')
for website in websites:
    print(website.website_name, website.website_url, website.activated)
    for position in website.position_set.all().order_by('category__n_category'):
        print(" -- ", position.category.n_category, position.website_position)