Basic relations in django (built on top of a legacy db) - django

I've googled on and on, and I just don't seem to get it.
How do I recreate simple join queries in django?
in models.py (Fylker is county, Dagensrepresentanter is persons)
class Fylker(models.Model):
    id = models.CharField(max_length=6, primary_key=True)
    navn = models.CharField(max_length=300)

    def __unicode__(self):
        return self.navn

    class Meta:
        db_table = u'fylker'

class Dagensrepresentanter(models.Model):
    id = models.CharField(max_length=33, primary_key=True)
    etternavn = models.CharField(max_length=300, blank=True)
    fornavn = models.CharField(max_length=300, blank=True)
    fylke = models.ForeignKey(Fylker, db_column='id')

    def __unicode__(self):
        return u'%s %s' % (self.fornavn, self.etternavn)

    class Meta:
        ordering = ['etternavn']  # set default ordering
        db_table = u'dagensrepresentanter'
Since the models are auto-created by django, I have added the ForeignKey and tried to connect it to the county. The id fields are inherited from the db I'm trying to integrate into this django project.
By querying
Dagensrepresentanter.objects.all()
I get all the people, but without their county.
By querying
Dagensrepresentanter.objects.all().select_related()
I get a join on Dagensrepresentanter.id and Fylker.id, but I want that join to be on fylke, i.e.
SELECT * FROM dagensrepresentanter d , fylker f WHERE d.fylke = f.id
This way I'd get the county name (Fylke navn) in the same resultset as all the persons.
Additional request:
I've read over the django docs and quite a few questions here at stackoverflow, but I can't seem to get my head around this ORM thing. It's the queries that hurt. Do you have any good resources (blogposts with experiences/explanations, etc.) for people accustomed to think of databases as an SQL-thing, that needs to start thinking in django ORM terms?

Your legacy database may not have foreign key constraints (for example, if it uses MyISAM, foreign keys aren't supported at all), in which case inspectdb has no relationships to pick up.
You have two choices:
Add foreign key constraints to your tables (this would involve converting to InnoDB if you are on MyISAM). Then run ./manage.py inspectdb again and the relationships should appear.
Use the tables as is (i.e., with no explicit relationships between them) and compose queries manually (e.g., Mytable.objects.get(other_table_id=23)), either at the object level or by writing your own SQL queries. Either way, you lose much of the benefit of Django's ORM query language.

Related

Use of select_related in simple query in django

I have a model in Django in which a field has a foreign key relationship with the Teacher model. I have come across select_related in Django and want to use it in my view. However, I am not sure whether to use it in my query or not.
My models:
class Teacher(models.Model):
    name = models.OneToOneField(max_length=255, default="", blank=True)
    address = models.CharField(max_length=255, default="", blank=True)
    college_name = models.CharField(max_length=255, default="", blank=True)

class OnlineClass(models.Model):
    teacher = models.ForeignKey(Teacher, on_delete=models.CASCADE)
My view:
def get(self, request, *args, **kwargs):
    teacher = self.request.user.teacher
    classes = OnlineClass.objects.filter(teacher=teacher)  # confusion is here
    serializer_class = self.get_serializer_class()
    serializer = serializer_class(classes, many=True)
    return Response(serializer.data, status=status.HTTP_200_OK)
I have marked the line in question with a comment. I want to list all the classes of that teacher, and here I have used filter. Can select_related be used here? My understanding is that I only need it if I also want to show other fields of the Teacher model, for example name or college_name; otherwise what I have done is correct. Also, is it right that select_related is only used for GET endpoints and not for POST?
First, the easiest way to get all classes per teacher is by using the related_name attribute (https://docs.djangoproject.com/en/3.2/ref/models/fields/#django.db.models.ForeignKey.related_name).
class OnlineClass(models.Model):
    teacher = models.ForeignKey(
        Teacher,
        on_delete=models.CASCADE,
        related_name='classes'
    )

# All classes of a teacher
teacher.classes.all()
When select_related is used, Django adds SQL joins to the query it generates, so the related rows are fetched together with the main ones. This reduces the number of queries sent to the database, and yes, it only affects reads.
for obj in OnlineClass.objects.all():
    # This hits the database on every iteration to get the teacher data,
    # with a new query like: select * from teacher_table where id = ...
    print(obj.teacher)

for obj in OnlineClass.objects.select_related('teacher').all():
    # This doesn't hit the database again.
    # The Django ORM has already joined the
    # OnlineClass and Teacher data in a single SQL query.
    print(obj.teacher)
I think that in your example, with only one teacher, using select_related or not doesn't make a big difference.
select_related is used to select additional data from related objects when the query is executed. It results in a more complex query. But it boosts performance if you have to access related data, since no additional database queries will be required.
See the Django documentation for select_related.
In your code it would be possible to use select_related, but it would be inefficient, because you're not accessing related objects of the queried classes. So using select_related would result in a more complex query without any advantage.
If you wanted to use select_related, the syntax would be classes = OnlineClass.objects.select_related('teacher').filter(teacher=teacher)

select_related with reverse foreign keys

I have two Models in Django. The first has the hierarchy of what job functions (positions) report to which other positions, and the second is people and what job function they hold.
class PositionHierarchy(models.Model):
    pcn = models.CharField(max_length=50)
    title = models.CharField(max_length=100)
    level = models.CharField(max_length=25)
    report_to = models.ForeignKey('PositionHierarchy', null=True)

class Person(models.Model):
    first_name = models.CharField(max_length=50)
    last_name = models.CharField(max_length=50)
    ...
    position = models.ForeignKey(PositionHierarchy)
When I have a Person record and I want to find the person's manager, I have to do
manager = person.position.report_to.person_set.all()[0]
# Can't use .first() because we haven't upgraded to 1.6 yet
If I'm getting people with a QuerySet, I can join (and avoid a second trip to the database) with position and report_to using Person.objects.select_related('position', 'position__report_to').filter(...), but is there any way to avoid making another trip to the database to get the person_set? I tried adding 'position__report_to__person_set' or just 'position__report_to__person' to the select_related, but that doesn't seem to change the query. Is this what prefetch_related is for?
I'd like to make a custom manager so that when I do a query to get Person records, I also get their PositionHierarchy and their manager's Person record without more round trips to the database. This is what I have so far:
class PersonWithManagerManager(models.Manager):
    def get_query_set(self):
        qs = super(PersonWithManagerManager, self).get_query_set()
        return qs.select_related(
            'position',
            'position__report_to',
        ).prefetch_related(
        )
Yes, that is what prefetch_related() is for. It will require an additional query, but the idea is that it will get all of the related information at once, instead of once per Person.
In your case:
(qs.select_related('position__report_to')
   .prefetch_related('position__report_to__person_set'))
should require two queries, regardless of the number of Persons in the original query set.
Compare this example from the documentation:
>>> Restaurant.objects.select_related('best_pizza').prefetch_related('best_pizza__toppings')

Modify Tables in database

I have this model
class Category(models.Model):
    category_name = models.CharField(max_length=255)
    sub_category_ID = models.IntegerField(null=True, blank=True)

    def __unicode__(self):
        return self.category_name
I already have data in the table, but I want to change sub_category_ID to a ForeignKey without deleting the entire database:
class Category(models.Model):
    category_name = models.CharField(max_length=255)
    sub_category_ID = models.ForeignKey('self', null=True, blank=True)

    def __unicode__(self):
        return self.category_name
So I run syncdb after I changed the model and it gave me the warning.
The following content types are stale and need to be deleted:
uTriga | event_event_category
Any objects related to these content types by a foreign key will also
be deleted. Are you sure you want to delete these content types?
If you're unsure, answer 'no'.
I typed yes and now am getting the error
column app_category.sub_category_ID_id does not exist
column uTriga_category.sub_category_ID_id does not exist
By default, Django's syncdb mechanism doesn't apply changes to existing models. It only creates tables for models added since the last syncdb run, so picking up a changed model means dropping and recreating the database.
So you probably want a migration tool that lets the DB schema evolve progressively. Using one is almost always a very good idea, IMHO.
South is the best known for Django, and so far I've been pretty happy with it.

Django many to many recursive relationship

I'm not so great with databases so sorry if I don't describe this very well...
I have an existing Oracle database which describes an algorithm catalogue.
There are two tables, algorithms and alg_xref.
Algorithms can have parent and child algorithms. alg_xref contains these relationships with two foreign keys, xref_alg and xref_parent.
These are the Django models I have so far from the inspectdb command
class Algorithms(models.Model):
    alg_id = models.AutoField(primary_key=True)
    alg_name = models.CharField(max_length=100, blank=True)
    alg_description = models.CharField(max_length=1000, blank=True)
    alg_tags = models.CharField(max_length=100, blank=True)
    alg_status = models.CharField(max_length=1, blank=True)
    ...

    class Meta:
        db_table = u'algorithms'

class AlgXref(models.Model):
    xref_alg = models.ForeignKey(Algorithms, related_name='algxref_alg', null=True, blank=True)
    xref_parent = models.ForeignKey(Algorithms, related_name='algxref_parent', null=True, blank=True)

    class Meta:
        db_table = u'alg_xref'
On trying to query AlgXref I encounter this:
DatabaseError: ORA-00904: "ALG_XREF"."ID": invalid identifier
So the error seems to be that it looks for a primary key ID which isn't in the table. I could create one, but that seems a bit pointless. Is there any way to get around this? Or should I change my models?
EDIT: So after a bit of searching it seems that Django requires a model to have a primary key. Life is too short so have just added a primary key. Will this have any impact on performance?
This is currently a limitation of the ORM provided by Django. Each model has to have one field marked as primary_key=True; if there isn't one, the framework automatically creates an AutoField named id.
However, this is being worked on as we speak as part of this year's Google Summer of Code and hopefully will be in Django by the end of this year. For now you can try to use the fork of Django available at https://github.com/koniiiik/django which contains an implementation (which is not yet complete but should be sufficient for your purposes).
As for whether there is any benefit or not, that depends. It certainly makes the database more reusable and causes fewer headaches if you just add an auto-incrementing id column to each table. The performance impact shouldn't be too high; the only thing you might notice is that a many-to-many table like this, containing only two ForeignKey columns, grows by about one half when a third column is added. That should, however, be irrelevant as long as you don't store billions of rows in that table.

Better way to query this Django Model

I am new to Django. I have a model and now I need to query the model.
The model is this:
from django.db import models
import datetime

class Position(models.Model):
    website_position = models.IntegerField()
    category = models.ForeignKey('Category')
    directorio = models.ForeignKey('Directorio')

class Category(models.Model):
    n_category = models.CharField(max_length=100)
    category_position = models.IntegerField()
    date_inserted = models.DateTimeField(auto_now_add=True)
    date_last_update = models.DateTimeField(auto_now_add=True)

class Directorio(models.Model):
    website_name = models.CharField(max_length=200)
    website_url = models.URLField()
    activated = models.BooleanField()
    date_inserted = models.DateTimeField(auto_now_add=True)
    date_last_update = models.DateTimeField(auto_now=True)
    categories = models.ManyToManyField(Category, through=Position)
Now, I need to query the database in this way:
select dc.n_category, dp.website_position, dd.website_name, dd.website_url, dd.activated
from directorio_category as dc
join directorio_position as dp on dc.id = dp.category_id
join directorio_directorio as dd on dp.directorio_id = dd.id
where dd.activated = 'true'
Question: Should I use the Model Query API or should I use RAW SQL?
Best Regards,
You can do it with the model query api, but if you already have the raw sql written that's going to be more efficient in the long term if you have to scale to enormous heights.
Pro of raw SQL: More efficiently hit the database.
Pro of query api: Non-SQL Django people will be able to maintain and extend your code in the future.
I've been interacting with databases via django's orm so long that I'm struggling to figure out what your query even means.
# I think this gets what you want
positions = Position.objects.filter(directorio__activated=True).order_by('category__n_category', 'website_position')
for position in positions:
    print(position.category.n_category, position.website_position, position.directorio.website_name, position.directorio.website_url, position.directorio.activated)
The key to migrating from SQL to django's ORM is starting to think in terms of the primary object(s) you want, then walking the relationships to get the related data. All data related to an object is available to you via object/dot notation, and thanks to the django ORM's laziness, most of it isn't retrieved until you ask for it.
The above code gets every position by category, then walks the relationships to get the other data. Another version of the code might get every active website then show its categories and positions:
websites = Directorio.objects.filter(activated=True).order_by('website_name')
for website in websites:
    print(website.website_name, website.website_url, website.activated)
    for position in website.position_set.all().order_by('category__n_category'):
        print(" -- ", position.category.n_category, position.website_position)